AI chat review and scoring
Reviewed and evaluated AI-generated chat responses to improve language quality, accuracy, and helpfulness. Performed RLHF-based annotation by comparing multiple model outputs and scoring each response on relevance, coherence, factual correctness, and adherence to safety guidelines. Identified errors, biases, and hallucinations, and provided structured feedback to improve model performance. Followed annotation guidelines and maintained consistency across tasks to ensure high-quality training data for conversational AI systems.
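The comparison-and-scoring workflow above can be sketched in miniature. This is a hypothetical illustration, not the project's actual tooling: the dimension names, the 1-5 scale, the equal weighting, and the `prefer` tie-breaking rule are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical rubric dimensions used when comparing two model outputs;
# real annotation guidelines and weights are project-specific.
DIMENSIONS = ("relevance", "coherence", "factual_correctness", "safety")

@dataclass
class Annotation:
    response_id: str
    scores: dict          # dimension -> score on an assumed 1-5 scale
    notes: str = ""       # structured feedback: errors, biases, hallucinations

def total(annotation: Annotation) -> int:
    """Sum rubric scores; equal weighting is an assumption here."""
    return sum(annotation.scores[d] for d in DIMENSIONS)

def prefer(a: Annotation, b: Annotation) -> str:
    """Return the preferred response id, or 'tie' when totals match."""
    ta, tb = total(a), total(b)
    if ta == tb:
        return "tie"
    return a.response_id if ta > tb else b.response_id

a = Annotation("model_a", {"relevance": 5, "coherence": 4,
                           "factual_correctness": 5, "safety": 5})
b = Annotation("model_b", {"relevance": 4, "coherence": 4,
                           "factual_correctness": 3, "safety": 5},
               notes="hallucinated a citation")
print(prefer(a, b))  # model_a
```

In practice, the preference label and the free-text notes together form the structured feedback that trains the reward model.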