Clinical Vignette & Prompt Evaluation – AI Output Review/Labeling
Independently reviewed and rated over 1,000 AI-generated clinical explanations during Step 2 CK exam preparation. Focused on identifying subtle errors and unsafe recommendations in LLM-produced answers across multiple specialties. Developed strong intuition for RLHF annotation and model safety evaluation.
• Flagged inappropriate drug dosing and contraindication errors consistently.
• Critiqued AI medical recommendations for evidence-based accuracy.
• Assessed alignment with clinical guidelines in vignette explanations.
• Enhanced model safety by careful annotation and review of LLM mistakes.