AI Data Specialist — RLHF & Response Evaluation (Meta AI via Outlier / Multimango)
As an AI Data Specialist for RLHF and response evaluation, I conducted large-scale LLM output comparisons and preference rankings. My work focused on structured rubric-based assessments emphasizing helpfulness, accuracy, harmlessness, and instruction-following. I consistently maintained annotation quality over extensive daily sessions.
• Performed comparative evaluation of LLM outputs using structured rubrics.
• Conducted preference ranking for reinforcement learning from human feedback (RLHF).
• Evaluated model-generated content employing multi-dimensional quality frameworks.
• Maintained high annotation standards across 8–10 hour daily workloads.