AI Chatbot Response Evaluation and RLHF Annotation
Contributed to training and improving a large language model by evaluating AI-generated responses for accuracy, safety, coherence, and relevance. Performed RLHF (Reinforcement Learning from Human Feedback) tasks, ranking multiple model outputs and providing structured rationales for each preference selection. Completed over 5,000 prompt-response evaluations, including safety red-teaming tasks to identify hallucinations, bias, and harmful content. Followed strict annotation guidelines, maintaining a 95%+ quality score and passing weekly calibration reviews.