Prompt Evaluation & Response Ranking (Freelance)
Conducted prompt evaluation and response ranking for large language models as a freelance contributor. Assessed LLM outputs for alignment, correctness, and structure, and ranked them by overall helpfulness. Participated in structured feedback and annotation loops to improve model behavior.
• Ranked multiple AI model completions for performance benchmarking.
• Reviewed text outputs to verify factual accuracy and intent match.
• Took part in iterative annotation and feedback cycles for AI refinement.
• Used web research and structured evaluation criteria to support evaluations.