Leiden University
Bachelor of Science, Computer Science
As an experienced data specialist with a focus on high-stakes model alignment, I specialize in generating and auditing high-quality datasets for Large Language Models (LLMs). My background is rooted in Reinforcement Learning from Human Feedback (RLHF), where I have moved beyond simple classification to handle complex multi-turn reasoning and preference ranking. I am highly proficient in applying "gold standard" rubric criteria to evaluate model outputs for factuality, tone, and logical consistency. By combining a meticulous eye for detail with a deep understanding of edge cases, I ensure that training data is not only accurate but also robust enough to handle real-world deployment challenges.

My standout qualifications involve extensive work in Red Teaming and Cybersecurity data curation. I have led projects focused on adversarial prompting, where I intentionally crafted sophisticated "jailbreak" attempts to test model safety boundaries and labeled the resulting data to improve defensive filtering.

My technical expertise includes:
Vulnerability Mapping: Labeling and categorizing code-based threats, such as cross-site scripting (XSS) and SQL injection, to train models in secure coding practices.
Adversarial Simulation: Developing complex social engineering scenarios to pressure-test a model's refusal logic and safety guardrails.
Policy Alignment: Translating ambiguous safety guidelines into concrete, labeled examples that prevent the generation of harmful or biased content.
Joao M. hasn’t added any AI Training or Data Labeling experience to their OpenTrain profile yet.
Bachelor of Science, Computer Science
VWO, Secondary Education - Natural Sciences
Cyber Defence Center Analyst
Teaching Assistant