AI Certificate Parser – Data Extraction and Structuring Contributor
I worked on the AI Certificate Parser project, which involved using OCR and large language models to extract and structure data from certificate documents. My main responsibilities included developing and refining algorithms to improve the accuracy of text extraction and formatting. The goal was to automate the end-to-end process of digitizing certificates for scalable use.
• Processed scanned certificates using Python-based OCR technologies.
• Designed workflows to convert unstructured document text into structured data.
• Applied text generation techniques for data validation and consistency.
• Utilized internal/proprietary tooling for data labeling and annotation.
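The step of converting unstructured document text into structured data can be illustrated with a minimal sketch. This is not the project's actual code, which relied on internal tooling. It assumes the OCR step has already produced plain text, and the field names, regex patterns, and sample certificate below are all hypothetical.

```python
import re
from dataclasses import dataclass, asdict
from typing import List, Optional

# Hypothetical field patterns; real certificates would need layout-specific rules
# (or an LLM-based extractor) rather than three fixed regexes.
FIELD_PATTERNS = {
    "recipient": re.compile(r"awarded to[:\s]+(.+)", re.IGNORECASE),
    "course": re.compile(r"for completing[:\s]+(.+)", re.IGNORECASE),
    "issue_date": re.compile(r"date[:\s]+(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
}


@dataclass
class CertificateRecord:
    """Structured view of one certificate; fields stay None when not found."""
    recipient: Optional[str] = None
    course: Optional[str] = None
    issue_date: Optional[str] = None


def parse_certificate_text(ocr_text: str) -> CertificateRecord:
    """Scan OCR output line by line and keep the first match for each field."""
    record = CertificateRecord()
    for raw_line in ocr_text.splitlines():
        line = " ".join(raw_line.split())  # collapse irregular OCR whitespace
        for field_name, pattern in FIELD_PATTERNS.items():
            match = pattern.search(line)
            if match and getattr(record, field_name) is None:
                setattr(record, field_name, match.group(1).strip())
    return record


def missing_fields(record: CertificateRecord) -> List[str]:
    """Simple consistency check: list the fields that could not be extracted."""
    return [name for name, value in asdict(record).items() if value is None]


# Made-up sample standing in for OCR output.
SAMPLE_TEXT = """
Certificate of Completion
Awarded to:   Jane Doe
For completing: Intro to Data Engineering
Date: 2023-05-14
"""

if __name__ == "__main__":
    parsed = parse_certificate_text(SAMPLE_TEXT)
    print(asdict(parsed))
    print("missing:", missing_fields(parsed))
```

Keeping unmatched fields as None, rather than guessing a value, makes it easy to route incomplete records to manual review or a second extraction pass.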