AI Evaluation Specialist / LLM Response Analyst
At Mercor, I performed structured evaluations of AI-generated responses using rubric-based and side-by-side (SxS) frameworks. I analyzed model outputs for reasoning, correctness, and quality across diverse real-world scenarios. My work supported improvements in AI agent performance, particularly in specialized domains such as manufacturing.
• Conducted high-volume, detailed side-by-side comparisons of LLM responses.
• Provided structured justifications to support the reliability of evaluation scores.
• Identified failure patterns and proposed areas for model improvement.
• Applied domain-specific knowledge to manufacturing use cases.