Data Preparation and Model Training for OCR and Detection Models
Built and maintained OCR pipelines that extract structured data from scanned documents, with a focus on data reliability and accuracy. Assisted in training and fine-tuning detection models by preparing datasets and carrying out evaluation and error analysis. Automated and optimized ingestion workflows through preprocessing and AI model integration.
• Used EasyOCR, Tesseract, and OpenCV for document processing.
• Prepared datasets and performed error analysis to refine model performance.
• Participated in training and evaluating object detection pipelines using PyTorch/TensorFlow and YOLO with Roboflow.
• Improved data workflows and model reliability through automation and collaborative coding practices.
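As an illustration of the kind of OCR error analysis described above, the sketch below computes a character error rate (CER) between a ground-truth string and an OCR output. It is a minimal, assumed example: the entry does not state which metric was used, and the function names `edit_distance` and `character_error_rate` as well as the sample strings are hypothetical.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the reference length (assumed convention)."""
    if not ref:
        # Assumed convention for an empty reference: 0.0 if both are empty, else 1.0.
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical sample: OCR confused the letter "O" and the digit "0" twice.
cer = character_error_rate("INVOICE 1024", "INV0ICE 1O24")
print(f"CER: {cer:.3f}")
```

Sorting documents by a score like this is one way to surface the worst OCR outputs for manual review, which can then guide preprocessing changes.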