Data Labeler at Data Annotation
I specialized in bilingual evaluation and created model training data for CS/STEM tasks as a data labeler at Data Annotation and Prolific. My work included evaluating model capabilities and designing rubrics for large language model (LLM) evaluation. I also contributed to multilingual model assessment and wrote prompts and responses for supervised fine-tuning of LLMs.
• Performed cross-lingual model evaluation and benchmarking.
• Created structured training datasets in technical (CS/STEM) domains.
• Designed annotation rubrics for systematic comparison of model outputs.
• Assessed model outputs for accuracy, fluency, and relevance.