AI Data Annotation & Evaluation (Freelance/Contract)
As a freelance AI Data Annotator & Evaluator, I contributed to large language model (LLM) training pipelines by annotating and evaluating software engineering datasets. My primary focus was evaluating code quality and annotating task prompts across languages and frameworks such as JavaScript, TypeScript, React, and Python. I verified technical accuracy, flagged unsafe outputs, and identified model hallucinations in annotated responses.
• Annotated software engineering Q&A pairs to validate code generation, debugging, and system design prompts.
• Reviewed and ranked AI-generated code solutions for correctness, efficiency, and alignment with best practices (a minimal sketch of this rubric-style ranking follows this list).
• Identified factual errors, hallucinations, and prompt injection attempts in LLM outputs as part of responsible AI evaluation.
• Collaborated with platforms including Outlier AI, Mercor, and Mindrift to support frontier model training.
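To illustrate the rubric-style ranking described above, here is a minimal Python sketch of how reviewed solutions can be scored and ordered. It assumes a simple weighted rubric; the class, field names, and weights are hypothetical and do not reflect any platform's actual schema.

# Minimal sketch of a rubric-style review record for ranking
# AI-generated code solutions. All names and weights below are
# hypothetical, chosen for illustration only.
from dataclasses import dataclass


@dataclass
class SolutionReview:
    """One reviewer's assessment of a single model-generated solution."""
    solution_id: str
    correctness: int             # 1-5: does the code do what the prompt asks?
    efficiency: int              # 1-5: algorithmic and resource efficiency
    style: int                   # 1-5: idiomatic, readable, best practices
    hallucination: bool = False  # e.g. cites nonexistent APIs or facts
    unsafe: bool = False         # e.g. prompt injection or dangerous code

    def score(self) -> float:
        """Weighted score; hallucinated or unsafe answers are disqualified."""
        if self.hallucination or self.unsafe:
            return 0.0
        return 0.5 * self.correctness + 0.3 * self.efficiency + 0.2 * self.style


def rank(reviews: list[SolutionReview]) -> list[SolutionReview]:
    """Order candidate solutions best-first by weighted score."""
    return sorted(reviews, key=lambda r: r.score(), reverse=True)


if __name__ == "__main__":
    candidates = [
        SolutionReview("model-a", correctness=5, efficiency=3, style=4),
        SolutionReview("model-b", correctness=4, efficiency=5, style=5,
                       hallucination=True),  # disqualified despite high marks
    ]
    for review in rank(candidates):
        print(review.solution_id, review.score())

In this sketch, a hard disqualification for hallucinated or unsafe output takes precedence over quality weights, mirroring the responsible-AI evaluation priority described in the bullets above.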