Ranking Chatbot Responses for Helpfulness
Compared two AI-generated answers to the same user question and selected which one was more helpful, accurate, and harmless.Labeled 500 prompts across topics like general knowledge, writing help, and basic math. Followed a simple rubric: helpful >vague, accurate> made-up info, safe>toxic.Weekly feedback sessions improved consistency.