Data Scientist – Intelligent Document Processing/Entity Extraction
Led the development of a document processing system for automated extraction of structured information from large-scale document datasets. Designed and implemented pipelines for document classification, key-value extraction, and rule-based data transformation. Developed LLM-powered modules to extract entities and attributes from unstructured documents.
• Performed large-scale document ingestion and classification using custom models.
• Built information extraction modules utilizing LLMs for entity extraction from documents.
• Generated structured outputs in JSON and Excel to facilitate downstream analytics.
• Developed automation workflows for document labeling, processing, and validation.