AI Prompt Evaluator (Freelance) — Outlier AI
As an AI Prompt Evaluator with Outlier AI, I designed and tested structured prompts for evaluating large language models across creative, legal, and analytical domains. I scored model outputs for coherence, factual grounding, and instruction adherence, ensuring unbiased and contextually accurate responses, and applied analytical and domain-specific reasoning to refine AI-generated content for quality and integrity.
• Developed and executed benchmarking scenarios using advanced prompting strategies.
• Detected and flagged ambiguous or biased AI outputs, drawing on legal and compliance expertise.
• Collaborated with cross-functional teams to iterate on annotation guidelines and feedback loops.
• Documented evaluation outcomes across multiple LLM platforms and tracking tools.