AI Response Evaluation Analyst
Worked on a project focused on AI response evaluation and training-data quality, assessing and improving large language model outputs against structured rubrics. Reviewed user prompts, analyzed multiple AI-generated responses per prompt, and rated them across dimensions such as instruction following, relevance, completeness, factual accuracy, writing quality, and overall usefulness. Performed side-by-side comparative evaluations using Likert-scale rankings, identified major and minor issues, and provided detailed justifications with actionable improvement suggestions. Where responses involved tool calls or code, validated tool usage, code logic, and output correctness, delivering high-quality human feedback to support model training and performance improvement.