AI Response Rater & Evaluation Specialist
As an AI Response Rater & Evaluation Specialist at Anthropic, I rated LLM-generated responses against structured rubrics along accuracy, safety, and helpfulness dimensions. My work included comparative preference ranking, detailed labeling for safety datasets, and ongoing refinement of annotation guidelines. I regularly provided written rationales and participated in calibration to ensure inter-annotator agreement.

• Labeled sensitive, adversarial, and policy-compliance content for model safety datasets
• Conducted 500–700 pairwise preference ratings per week with a >95% audit pass rate
• Submitted 80+ feedback reports that led to improved labeling guidelines
• Participated in team calibration sessions, maintaining Cohen's Kappa >0.82