Senior AI Content Specialist (STEM/Reasoning), Anthropic (Contract via Scale AI)
Led the "Human-in-the-Loop" data labeling initiative for mathematical and STEM reasoning used in Claude 3.5 Sonnet. Developed and applied rubric scoring systems to evaluate output truthfulness and reduce hallucinations in model reasoning. Oversaw the creation of over 5,000 high-quality SFT trajectories within a high-priority labeling sprint.
• Spearheaded mathematical proof verification and technical alignment
• Implemented systematic evaluation methodologies for model truthfulness
• Scaled collaborative label creation with expert contributors
• Ensured high accuracy in complex reasoning tasks