AI Data Specialist (Meta)
Evaluated large language model (LLM) conversational responses for factual accuracy, reasoning validity, clarity, and tone. Performed large-scale pairwise comparisons and ranking tasks, annotating strengths and weaknesses against standardized rubrics. Identified reasoning gaps and communication failures, and followed structured taxonomies to maintain high inter-annotator agreement.
• Conducted detailed fact-checking using trusted public sources.
• Produced consistent, reproducible evaluation artifacts for model quality improvement.
• Supported reinforcement learning and model optimization workflows.
• Maintained rigorous adherence to evaluation guidelines and benchmarks.