Data Validator for AI/LLM Training Sets
Evaluated and validated large text datasets for accuracy, consistency, and completeness, directly supporting high-quality AI output assessments. Applied systematic anomaly detection and established data quality benchmarks tailored to AI/LLM data evaluation workflows. Liaised with stakeholders to align data quality criteria with annotation standards in multilingual settings.
• Reviewed diverse text data to detect inconsistencies and errors
• Used SQL and Excel to support manual and semi-automated QA
• Ensured annotated/validated datasets were suitable for AI/LLM use
• Collaborated to define scalable data validation protocols