LLM Evaluator
Performed structured text data labeling and LLM response evaluation according to strict project guidelines. Tasks included categorizing text, evaluating AI-generated responses for relevance, factual accuracy, instruction adherence, and clarity, and ranking multiple outputs against quality criteria. Identified hallucinations, inconsistencies, and guideline violations while keeping judgments uniform across high-volume annotation work.