LLM Response Evaluator (ChatGPT, Emochi)
I evaluated and analyzed responses from large language models (LLMs) such as ChatGPT and Emochi to identify logical inconsistencies, conversational loops, and safety filter bypasses. My work focused on improving the overall quality and safety of AI-generated content by providing detailed feedback and creative prompting. These efforts enhanced the effectiveness and safety of LLM deployments in real-world scenarios.
• Conducted rigorous analysis and evaluation of AI model outputs
• Assessed context drift and logical consistency in responses
• Tested safety filters and identified potential bypasses
• Provided creative prompts and actionable feedback to improve model performance