Lead AI Engineer & Technical Co-Founder – NLP Data Labeling and Entity Extraction
I designed and managed NLP pipelines for labeling unstructured crime report text, covering entity recognition and metadata tagging. My responsibilities included custom text chunking strategies and entity classification to support semantic document retrieval in live production systems. I ensured that the labeled datasets supported accurate crime classification via large language models and vector search systems.
• Built workflows for named entity recognition and sensitive data (PII) detection in police and crime reports.
• Generated metadata tags and semantic segments for the body text of crime-related documents.
• Implemented custom chunking for annotation at the sentence, paragraph, and sliding-window levels.
• Provided labeled text sets that enabled classification, retrieval, and LLM prompt development.