AI Trainer for Anthropic (Contract)
As an AI Trainer for Anthropic, I evaluated AI-generated code and long-form responses for correctness and safety. I designed prompts and scenarios to rigorously test model behavior and reasoning, and I produced detailed annotation and evaluation reports to improve the quality of training data for large language models used in software engineering workflows.
• Reviewed and compared model outputs using structured quality frameworks
• Assessed model responses for correctness, safety, and adherence to instructions
• Identified weaknesses in model verification and code quality
• Improved training data through thorough evaluation and reporting