Clinical AI Response Evaluation & Medical Reasoning Annotation
This project involved evaluating and annotating AI-generated medical and dental responses to improve the accuracy, safety, and evidence-based alignment of their clinical reasoning. Responsibilities included reviewing diagnostic outputs, ranking responses against structured rubrics, flagging hallucinations and unsafe treatment recommendations, and providing corrective feedback for supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) workflows.
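The rubric-based ranking workflow described above can be sketched in code. The snippet below is a minimal, hypothetical illustration, not the project's actual tooling: the rubric dimension names, the 1–5 scoring scale, and the safety-flag veto rule are all assumptions chosen for the example. It shows how per-response rubric scores and safety flags could be reduced to a ranking, then converted into (chosen, rejected) preference pairs of the kind RLHF pipelines consume.

```python
from dataclasses import dataclass
from itertools import combinations

# Hypothetical rubric dimensions (illustrative names, not the project's actual rubric).
RUBRIC = ("clinical_accuracy", "safety", "guideline_alignment")

@dataclass
class ResponseAnnotation:
    response_id: str
    scores: dict            # rubric dimension -> score on an assumed 1-5 scale
    hallucination: bool = False
    unsafe_recommendation: bool = False

    def total(self) -> float:
        # Assumed veto rule: a hallucination or unsafe recommendation
        # zeroes out an otherwise high-scoring response.
        if self.hallucination or self.unsafe_recommendation:
            return 0.0
        return float(sum(self.scores[d] for d in RUBRIC))

def preference_pairs(annotations):
    """Convert graded annotations into (chosen, rejected) pairs for RLHF."""
    pairs = []
    for a, b in combinations(annotations, 2):
        if a.total() > b.total():
            pairs.append((a.response_id, b.response_id))
        elif b.total() > a.total():
            pairs.append((b.response_id, a.response_id))
    return pairs

anns = [
    ResponseAnnotation("r1", {"clinical_accuracy": 5, "safety": 5, "guideline_alignment": 4}),
    ResponseAnnotation("r2", {"clinical_accuracy": 4, "safety": 3, "guideline_alignment": 4}),
    ResponseAnnotation("r3", {"clinical_accuracy": 5, "safety": 5, "guideline_alignment": 5},
                       hallucination=True),  # vetoed despite high rubric scores
]
print(preference_pairs(anns))  # → [('r1', 'r2'), ('r1', 'r3'), ('r2', 'r3')]
```

Separating the rubric scores from the hard safety flags mirrors the annotation distinction in the work itself: a response can be well-written and still be unusable as a "chosen" example if it hallucinates or recommends unsafe treatment.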