ML Data Workflow and Curation Intern
During my internship at Exoclass, I contributed to dataset preprocessing and quality-focused machine learning development, including data-centric workflows for validation, augmentation, and feature processing. I participated in gathering, cleansing, and preparing text data to improve model generalization and reduce overfitting in multi-modal AI pipelines. My tasks strengthened dataset reliability for both training and evaluation purposes.
• Orchestrated data collection and preprocessing for multi-modal transformer tasks
• Applied data validation and augmentation processes
• Enhanced feature engineering for improved downstream ML tasks
• Optimized datasets to improve training and evaluation quality