AI Data Labeling and Model Evaluation Specialist
I have worked on AI data labeling and model evaluation projects involving text classification, reasoning validation, response ranking, and annotation quality review. These projects focused on improving model accuracy, instruction following, factual consistency, and response safety across a wide range of prompts. My labeling tasks included intent classification, sentiment and topic tagging, ranking multiple model responses, identifying hallucinations, verifying factual correctness, and applying detailed rubrics to assess clarity, relevance, and completeness. I also handled edge-case reviews, ambiguity resolution, and escalation of unclear annotation guidelines to maintain labeling consistency.

Projects ranged from several thousand to tens of thousands of data points and were often completed in collaboration with distributed teams under strict deadlines. Quality measures included high inter-annotator agreement, calibration rounds, gold-standard checks, blind-review sampling, and continuous QA feedback loops. I followed annotation guidelines closely, made consistent decisions, and prioritized precision to deliver reliable training data for downstream model improvement.
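For context, inter-annotator agreement of the kind mentioned above is commonly quantified with a chance-corrected statistic such as Cohen's kappa. The sketch below is illustrative only, assuming two annotators labeling the same items and scikit-learn's cohen_kappa_score; the intent labels and values shown are hypothetical, not drawn from any specific project.

```python
# Minimal sketch: chance-corrected agreement between two annotators
# using Cohen's kappa from scikit-learn. Data here is hypothetical.
from sklearn.metrics import cohen_kappa_score

# Hypothetical intent labels assigned by two annotators to the same six items.
annotator_a = ["billing", "support", "support", "sales", "billing", "support"]
annotator_b = ["billing", "support", "sales", "sales", "billing", "support"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # 1.0 = perfect agreement, 0.0 = chance level
```

In practice a threshold on kappa (often somewhere around 0.6 to 0.8, depending on task difficulty) is used to decide whether a batch needs a calibration round before annotation continues.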