LLM Response Evaluation & Ranking - RWS
Evaluated AI-generated outputs across text, image, and video tasks for clarity, accuracy, tone, and instruction adherence. Performed comparative ranking, structured annotation, and quality scoring against detailed guidelines. Maintained high consistency across large task volumes while meeting strict quality thresholds and turnaround expectations. Provided actionable feedback to improve model performance and alignment.