AI Evaluator – LLM and AI Output Evaluation
I evaluated and rated AI-generated responses across multiple platforms, focusing on model accuracy, helpfulness, and alignment. My contributions included writing and testing prompts to assess and improve model outputs across technical and general domains. Evaluations were conducted against defined rubrics, and structured feedback was provided to inform model development.
• Rated model outputs for quality and alignment.
• Wrote and tested prompts to identify weaknesses in AI models.
• Provided technical feedback in engineering and cloud domains.
• Supported RLHF and prompt-based evaluation workflows.