AI Training & LLM Evaluation Specialist — Medical Domain
As an AI Training & LLM Evaluation Specialist in the medical domain, I evaluated over 2,000 AI-generated medical responses for factual accuracy, logical consistency, clinical safety, and policy compliance. I annotated over 50,000 structured data points and applied RLHF frameworks to support model performance. My work included prompt engineering, response rewriting, and custom rubric development for content quality assurance.
• Conducted expert evaluation of model outputs for hallucinations and unsafe recommendations.
• Annotated large-scale structured medical data to enhance LLM datasets.
• Leveraged RLHF and evaluation rubrics for continuous model improvement.
• Supported model fine-tuning and safety review workflows across pharmacology, anatomy, virology, and public health domains.