DataAnnotation, freelance
I conducted side-by-side evaluations of large language model (LLM) outputs, assessing factual accuracy, logical reasoning, tone, and completeness. For each comparison I identified the stronger response against the specified criteria and wrote an evidence-based rationale for my selection. The work demanded analytical rigor and close attention to linguistic detail to keep the assessments reliable.
• Compared LLM-generated text outputs and assessed their quality.
• Evaluated responses for accuracy, reasoning, and appropriateness.
• Drafted comprehensive rationales with supporting data points.
• Maintained high standards for fair and systematic model comparison.