Evaluating AI Responses in Mandarin Chinese
This project involved evaluating AI-generated responses to Mandarin Chinese prompts across multiple domains, including general knowledge, business, and technical topics. My role focused on assessing the fluency, accuracy, contextual relevance, and cultural appropriateness of LLM outputs, with particular emphasis on bilingual consistency and business logic alignment. The work also required familiarity with prompt engineering, response classification, and error pattern identification, and it contributed to the continuous improvement of AI model performance in Chinese-language contexts.