AI Model Evaluation/Data Labeling (Rubric Scoring, VLM/LLM) — Contributor
Contributed to AI model evaluation and data labeling using rubric-based scoring protocols for multimodal (image and text) question-answering tasks. Applied strict SOP-driven procedures, including mandatory Indonesian language checks and answer-key validation, to assess model outputs under both objective and subjective criteria. Produced audit artifacts and concise English commentary to support scoring consistency, traceability, and reporting across multiple educational domains.
• Scored outputs against tiered criteria, such as full or partial correctness.
• Validated responses against curated reference keys, identifying mismatches and the labels required.
• Enforced rubric rules for both objective and subjective grading within a bilingual (Indonesian and English) workflow.
• Worked extensively with education-related content: math, science, physics, PJOK (physical education, sports, and health), and Indonesian history.