Japanese AI Response Evaluation & Guideline Development
In this project, I evaluated AI-generated responses in Japanese for linguistic accuracy, consistency, and alignment with predefined quality standards. My tasks included rating responses against RLHF (Reinforcement Learning from Human Feedback) criteria, refining evaluation guidelines, and developing selection standards for assessing AI-generated text. I also reviewed and improved criteria created by other evaluators to strengthen the reliability of the assessment process. The role required a deep understanding of Japanese-language nuance, critical thinking, and the ability to systematically analyze AI outputs in order to produce high-quality training data for LLM (Large Language Model) development.