AI Model Evaluation & Data Labeling Trainee (Outlier Platform)
Completed specialized AI model evaluation and data labeling training on the Outlier platform, focused on large language model outputs. Evaluated AI-generated responses against structured rubrics for accuracy and consistency, providing evidence-based feedback. Performed safety reviews, red teaming, prompt engineering, model weakness identification, and agent-coding evaluation to support continuous system improvement.
• Applied rubric-based assessment and prompt engineering to test model reasoning and instruction following.
• Conducted annotation and quality control of AI outputs, analyzing model failures and inconsistencies.
• Reviewed and tested risky or policy-sensitive model behavior using safety and red-teaming methodologies.
• Developed and evaluated agentic coding workflows focused on technical task execution and tool-use behavior.