Clinical Text Re-Annotation & Entity Alignment (Academic Project)
As part of my Master's thesis on clinical NLP, I experimented with re-annotating portions of the i2b2 2014 medical dataset using MetaMap for automated medical concept extraction.

Tasks performed:
- Ran MetaMap to extract UMLS concepts from clinical narratives
- Compared extracted entities with existing gold annotations
- Adjusted entity spans to align with token-level BIO format
- Cleaned text while preserving character offsets
- Reviewed boundary mismatches between automated and original annotations
- Conducted small-scale manual validation to verify alignment accuracy

Project Scope: Academic-scale experimentation (subset of clinical narratives)

Quality Measures:
- Manual inspection of entity span mismatches
- Offset validation to prevent annotation drift
- Verification of medical concept normalization
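
Illustrative sketch: the span-to-BIO alignment and offset validation steps described above could look roughly like the following. This is a minimal example written for this summary, not the thesis code. It assumes character-offset spans with an exclusive end, simple whitespace tokenization, and non-overlapping entities. The function names, the sample sentence, and the labels are hypothetical.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end, label); end is exclusive


def tokenize_with_offsets(text: str) -> List[Tuple[str, int, int]]:
    """Split on whitespace and keep each token's character offsets."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def spans_to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Assign B-/I-/O tags to tokens that overlap a character-level span."""
    tokens = tokenize_with_offsets(text)
    tags = ["O"] * len(tokens)
    for start, end, label in sorted(spans):
        first = True
        for i, (_, tok_start, tok_end) in enumerate(tokens):
            # A token belongs to the entity if the two ranges overlap at all,
            # so trailing punctuation attached to a token does not drop it.
            if tok_start < end and tok_end > start:
                tags[i] = ("B-" if first else "I-") + label
                first = False
    return [(tok, tag) for (tok, _, _), tag in zip(tokens, tags)]


def find_offset_drift(text: str, annotated: List[Tuple[int, int, str]]):
    """Return (start, end, surface) entries whose offsets no longer match the text."""
    return [(s, e, surf) for s, e, surf in annotated if text[s:e] != surf]


# Made-up sentence and labels, used only to exercise the functions.
text = "Patient John Smith was admitted to Mercy Hospital."
spans = [(8, 18, "PATIENT"), (35, 49, "HOSPITAL")]

for token, tag in spans_to_bio(text, spans):
    print(f"{token}\t{tag}")

# An empty list means every annotated surface string still sits at its offsets.
print(find_offset_drift(text, [(8, 18, "John Smith"), (35, 49, "Mercy Hospital")]))
```

Using an overlap test rather than exact boundary matching is one possible design choice: it tolerates tokens such as "Hospital." that carry trailing punctuation, at the cost of tagging the whole token. Running the drift check after every cleaning step is one way to catch offsets that shifted when characters were removed or replaced.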