AI/ML Data Pipeline & Feature Engineering — Financial Services
Designed and implemented data validation, transformation, and feature engineering pipelines across three roles in financial services and AI automation. Categorized and tagged structured financial datasets for ML model training, defined ground-truth labeling criteria for a CNN-based currency classifier (genuine vs. counterfeit), and applied K-Means clustering with iterative label refinement for customer segmentation. Built quality assurance layers to keep labeled data consistent and accurate across pipelines processing up to 2M records/day. Worked with both structured tabular data and unstructured text, applying tokenization and embeddings in line with NLP annotation standards.