Text, Image
Evaluated AI-generated content for quality, safety, policy adherence, and usability. Labeled outputs against predefined rubrics covering correctness, tone, relevance, and potential risk. Flagged unsafe, misleading, or low-quality outputs and provided structured feedback to improve downstream model behavior. Ensured consistent, reliable labeling across large evaluation batches.