AI Trainer & Model Evaluation Specialist
As an AI Trainer & Model Evaluation Specialist, I contributed to RLHF workflows for production LLM systems. My work covered dataset curation, prompt design, and the creation of evaluation benchmarks. I ranked model outputs across diverse tasks and flagged issues in reasoning, bias, and safety. I also performed structured error analysis, producing feedback used in continuous model fine-tuning cycles.
• Evaluated and ranked model responses across coding, reasoning, and instruction-following tasks
• Conducted comprehensive error analyses and flagged output issues
• Created and improved evaluation benchmarks
• Maintained strong quality scores and supported model alignment