AI Data Labeler / LLM Evaluator (Contract)
As an AI Data Labeler and LLM Evaluator at Outlier.ai, I evaluated large language model outputs for mathematical reasoning and coding tasks. My work included validating step‑by‑step solutions, annotating logical errors, and generating or reviewing prompts and answers according to strict rubrics. I also completed coding evaluation tasks and performed structured, time-tracked evaluations across multiple projects.
• Validated mathematical reasoning and annotated logical errors (Hickory Project).
• Generated, reviewed, and graded math prompts and solutions (Mechan Glen Project).
• Performed Python coding tasks in Docker environments (SkillMaster Project).
• Conducted structured AI evaluation tasks with detailed tracking (Aether Project).