AI Training & Quality Rater
Reviewed and evaluated 10,000+ LLM-generated outputs spanning text, speech-to-text transcripts, and video captions. Labeling tasks included content rating, transcript and caption validation, quality assessment, safety and bias checks, reasoning and chain-of-thought evaluation, and relevance judgments. Managed a high-volume workflow under strict rubric-based scoring and quality standards, using Surge AI, Alectio, Scale Loop, and Outlier’s evaluation interfaces to deliver structured feedback that improved model performance. Ensured accuracy, consistency, and compliance across multimodal datasets.