AI Quality Analyst/LLM Evaluator (Freelance, Outlier)
As an AI Quality Analyst and LLM Evaluator (freelance) at Outlier, I evaluated outputs from large language models and generative AI systems, assessing the relevance, accuracy, and safety of AI-generated text and Python code against business and user requirements. I worked within structured human-feedback evaluation workflows, including RLHF, to support continuous improvement in model performance.
• Evaluated LLM and generative AI outputs for accuracy, safety, and instruction adherence.
• Designed and refined prompts to surface gaps and inconsistencies in AI responses.
• Tested and optimized prompts to improve response quality.
• Reviewed AI-generated Python code for logic errors and quality risks.