AI Chatbot Response Evaluation and Ranking (RLHF)
I evaluated and ranked multiple responses generated by a large language model against criteria such as accuracy, consistency, and helpfulness. I also wrote diverse prompts to test and train the model across a range of scenarios. This work supplied the human feedback used directly to improve the model's conversational abilities.