Multilingual AI Data Labeling & Evaluation Project
Contributed to large-scale multilingual AI training datasets for NLP and generative AI evaluation.
- Performed fine-grained text classification, intent tagging, and translation quality assessment (accuracy, fluency, adequacy) for Korean–English–Spanish corpora.
- Designed evaluation rubrics, scored model outputs, and verified data consistency across engines.
- Logged and validated annotations in structured Excel templates, ensuring inter-annotator agreement and guideline compliance.
- Supported model alignment through prompt-response scoring, contextual judgment, and qualitative feedback summaries.
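The inter-annotator agreement mentioned above is commonly quantified with Cohen's kappa, which corrects raw agreement for chance. A minimal sketch (the label values and annotator data are hypothetical, for illustration only):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators' labels on the same items."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators match
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label distribution
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical fluency ratings from two annotators
ann1 = ["good", "good", "bad", "good", "bad"]
ann2 = ["good", "bad", "bad", "good", "bad"]
print(round(cohens_kappa(ann1, ann2), 3))  # → 0.615
```

Values above roughly 0.6 are often read as substantial agreement; lower scores typically trigger a guideline review and re-annotation pass.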