AI Evaluation Specialist (LLM Evaluation, RLHF)
I specialized in evaluating large language model outputs using Reinforcement Learning from Human Feedback (RLHF) methodologies. My responsibilities involved ranking and assessing AI-generated text for accuracy, helpfulness, and safety, ensuring strict adherence to technical and logical standards throughout model evaluation cycles.
• Provided targeted feedback to improve model response quality.
• Applied rigorous logic and prompt engineering techniques to test model consistency.
• Focused on refining LLMs across multiple subject domains, including architecture and technology.
• Maintained clear records using internal/proprietary tooling.