Expert AI Training & RLHF Evaluator
Scope & Tasks: Conducted multi-turn RLHF (Reinforcement Learning from Human Feedback) evaluations for frontier LLMs, ranking model responses on truthfulness, reasoning depth, and instruction-following.

Specific Activities:
- Comparative Analysis: Ranked paired model outputs by identifying subtle hallucinations and logical fallacies.
- Creative & Technical Writing: Authored high-quality prompts and model responses to establish "gold standard" training data.
- Adversarial Testing: Performed red-teaming to stress-test model safety and alignment with global ethical guidelines.
- Linguistic Precision: Completed complex tasks in both English and Brazilian Portuguese, ensuring flawless grammar and cultural nuance.

Quality Measures: Adhered to strict guideline documents exceeding 100 pages, maintaining a high acceptance rate through detailed, evidence-based rationales for every evaluation.