AI Data Evaluator (Contract) — Appen
As an AI Data Evaluator at Appen, I performed dataset labeling and quality assurance for AI training pipelines, evaluating large language model outputs for accuracy, relevance, and reasoning quality. I applied structured evaluation guidelines and frameworks to rank and compare AI-generated responses.

• Identified hallucinations and logical inconsistencies in model-generated text
• Ensured compliance with detailed annotation guidelines
• Contributed to the development of high-quality training datasets for LLMs
• Supported quality assurance in the AI model evaluation process