AI/ML Data & Content Evaluation (Freelance, Part-time)
Evaluated outputs from AI assistants for accuracy, clarity, and compliance with task specifications. Created prompt-based test cases to probe model reasoning and reveal weaknesses. Selected media files and authored multi-turn dialogues for training data.
• Ensured adherence to detailed evaluation rubrics for biology and media tasks.
• Identified and documented ambiguous or inconsistent outputs for retraining.
• Provided clear, concise reviewer notes for annotation quality control.
• Used spreadsheets and internal tools for annotation tracking and reporting.