Data Labeling Specialist & AI Training Contributor
Contributed to Reinforcement Learning from Human Feedback (RLHF) projects by ranking and evaluating AI-generated responses. Applied detailed annotation guidelines to assess helpfulness, accuracy, safety, and adherence to instructions. Supported model improvement initiatives through precise, consistent response evaluation.
• Ranked large batches of ChatGPT and Claude AI outputs for instructional quality.
• Applied safety and policy compliance checks when judging responses.
• Used efficacy benchmarks to differentiate high- and low-quality completions.
• Provided real-time feedback that contributed to model iteration cycles.