RLHF / LLM Evaluation
Worked on RLHF-based training and evaluation of large language models, assessing and improving model outputs for accuracy, safety, and instruction adherence. Tasks included ranking multiple candidate responses, flagging hallucinations, evaluating factual correctness, tone, and policy compliance, and writing structured feedback to guide model optimisation (a sketch of this ranking workflow follows below). Applied detailed annotation guidelines and quality standards to keep evaluations consistent across annotators, contributing to better-aligned model behaviour. Experience includes handling edge cases, ambiguous prompts, and nuanced judgement tasks requiring critical thinking and contextual understanding.
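A minimal sketch of the kind of preference-ranking data this work produces, assuming a simple annotation record; the field names and the pairwise-expansion step are illustrative, not a record of any specific client tooling:

```python
# Illustrative only: the schema and helper below are assumptions showing how
# a full ranking of candidate responses is typically expanded into
# (chosen, rejected) pairs for reward-model training in RLHF pipelines.
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class PreferenceAnnotation:
    """One evaluation item: a prompt, candidate responses, and a ranking."""
    prompt: str
    responses: list[str]
    ranking: list[int]  # indices into `responses`, best first
    hallucination_flags: list[bool] = field(default_factory=list)
    notes: str = ""  # structured free-text feedback for the trainer


def to_pairwise_preferences(item: PreferenceAnnotation) -> list[tuple[str, str]]:
    """Expand a best-first ranking into (chosen, rejected) pairs."""
    return [
        (item.responses[better], item.responses[worse])
        for better, worse in combinations(item.ranking, 2)
    ]


# Example usage
item = PreferenceAnnotation(
    prompt="Summarise the attached policy document.",
    responses=[
        "Accurate, concise summary.",
        "Summary containing a fabricated date.",
        "Off-topic reply.",
    ],
    ranking=[0, 1, 2],
    hallucination_flags=[False, True, False],
)
print(to_pairwise_preferences(item))  # three (chosen, rejected) pairs
```

Expanding a single n-way ranking into all pairwise comparisons is a common design choice because it extracts more training signal per annotated item than keeping only the top response.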