AI Data Annotation & Labeling Specialist (RLHF)
Evaluated AI-generated text responses for correctness, tone, safety, and helpfulness as part of RLHF and model-alignment workflows. Compared model outputs against human preferences to support high-quality reinforcement-learning datasets, using consensus scoring, inter-annotator agreement, and multi-pass review to keep evaluations accurate.
• Reviewed model output for harmful content and tone deviations
• Ranked and rated model responses to drive machine-learning refinement
• Flagged edge cases and provided resolution feedback
• Maintained high consistency across RLHF evaluations