AI Model Evaluator / AI Data Reviewer (Freelance / Contract)
As an AI Model Evaluator and AI Data Reviewer, I evaluated AI-generated text for factual accuracy, reasoning quality, and adherence to instructions. I used structured, rubric-based scoring to rate the logical coherence, tone, and overall compliance of large language model (LLM) responses. I also documented recurring patterns of model errors and produced benchmarking reports for performance analysis.
• Evaluated prompt-response pairs and multi-step reasoning tasks for LLMs.
• Identified hallucinations, contradictions, and ambiguous outputs.
• Applied structured rubrics for consistent rating across diverse prompts.
• Delivered actionable insights to improve model performance.