Remote AI Evaluation Contributor
Evaluated AI-generated responses for factual accuracy, logical consistency, and adherence to instructions. Work involved structured scoring, detailed annotation, and ranking multiple responses for clarity and correctness. Quality control and structured feedback helped improve the reliability and performance of AI outputs.

• Reviewed AI outputs against prompts and scoring rubrics
• Identified contradictions, unsupported claims, and logical errors
• Provided structured written feedback and clear annotations
• Tested prompt variations and follow-up question consistency