Prompt Evaluator
As a Prompt Evaluator, I design and evaluate structured prompt–response datasets for advanced AI language models. My responsibilities include creating complex prompts, authoring reference answers, and developing evaluation rubrics. I analyze AI model outputs for accuracy and adherence to detailed criteria.
• Designed high-quality prompts with multiple constraints
• Authored golden answers for benchmarking
• Developed and implemented evaluation rubrics
• Participated in dataset QA and benchmarking workflows