LLM Response Evaluation and RLHF Annotation Project
Assessed 5,500 AI responses for clarity, accuracy, safety, and relevance. Rated outputs against rubrics and wrote a justification for each rating. Identified inaccuracies, inconsistencies, and policy violations, fact-checking claims against credible academic and industry sources. Developed and refined prompt-response pairs for supervised fine-tuning (SFT). Supported the RLHF process by rating candidate responses and selecting preferred outputs. Maintained high standards of quality and accuracy throughout.