PhD Researcher – Clinical Information Extraction Pipeline
Designed and implemented an LLM-based pipeline for extracting and structuring clinical information from unstructured pathology reports. Developed datasets by annotating clinical entities within medical documents to support analytics and downstream machine learning. Performed iterative label refinement to optimize information extraction for healthcare applications.
• Constructed NER-labeled corpora from medical reports.
• Applied explainable AI for annotation verification.
• Enhanced data preprocessing for improved label consistency.
• Benchmarked extraction pipelines against clinical gold standards.