Healthcare LLM Training Data Annotator
Curated and annotated a 520k-document corpus to build a clinical-grade LLM for a telehealth startup. Designed exhaustive NER schemas, built Prodigy recipes for rapid annotation, and used double-masked review plus automated spaCy validation to keep precision and recall above 0.97. Converted physician notes into de-identified Q&A pairs and brief patient-friendly summaries, which accelerated model deployment by three months. Process improvements cut labelling turnaround by 60% while holding overall accuracy at 98%+, earning a follow-on contract worth $85k.