LLM Output Evaluation & Human Feedback (RLHF)
Performed human-in-the-loop evaluation and feedback tasks to improve large language model performance and alignment. Responsibilities included reviewing and comparing multiple model-generated responses, ranking outputs on accuracy, relevance, reasoning quality, and instruction-following, and providing structured qualitative feedback. Tasks required close adherence to detailed project guidelines, consistent judgment across edge cases, and careful fact-checking. Emphasis was placed on reasoning clarity, real-world correctness, and quality assurance. Work also included some voice-over and audio submissions. An illustrative sketch of the kind of ranking record such tasks produce is shown below.
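The sketch below is a minimal, hypothetical illustration of what a single preference-ranking task might yield: several candidate responses to one prompt, ranked and scored against a rubric, with written feedback and a fact-check result attached. All class names, field names, and values are assumptions for illustration only, not the actual project schema or tooling.

```python
from dataclasses import dataclass, field

# Hypothetical schema for one human-feedback task: the annotator ranks
# several candidate responses to a single prompt and attaches rubric
# scores plus written justification. Names are illustrative assumptions.

@dataclass
class CandidateResponse:
    model_id: str      # which model produced the response
    text: str          # the response being judged
    rank: int          # 1 = best, higher numbers = worse
    rubric: dict = field(default_factory=dict)  # e.g. {"accuracy": 5}

@dataclass
class PreferenceRecord:
    prompt: str                        # instruction shown to the models
    candidates: list                   # ranked CandidateResponse objects
    qualitative_feedback: str = ""     # structured written justification
    fact_check_passed: bool = True     # outcome of manual fact-checking

# Example: two responses compared and ranked for one prompt.
record = PreferenceRecord(
    prompt="Explain why the sky is blue.",
    candidates=[
        CandidateResponse("model_a", "Rayleigh scattering ...", rank=1,
                          rubric={"accuracy": 5, "instruction_following": 5}),
        CandidateResponse("model_b", "Because of reflection ...", rank=2,
                          rubric={"accuracy": 2, "instruction_following": 4}),
    ],
    qualitative_feedback="Response A correctly cites Rayleigh scattering; "
                         "Response B is factually inaccurate.",
)
```

Records of this shape are what downstream RLHF pipelines typically consume as pairwise or listwise preference data for reward-model training.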