Multimodal Live Interaction Assessment & RLHF Evaluation for Conversational AI
As a senior contributor and reviewer, I participated in the evaluation of advanced conversational AI systems' new live interaction modes. The project involved designing and executing comprehensive test scenarios to assess live conversation capabilities, memory functions, and real-time user interactions across multiple modalities, including audio and video. My responsibilities included:

- Multimodal annotation and quality assurance (text, image, video, live interaction)
- Creating test scenarios and conducting live conversations
- Providing structured feedback to improve model performance and evaluation guidelines
- Applying detailed RLHF-based rating scales to evaluate model outputs
- Reviewing, rating, and optimizing other contributors' tasks

The project contributed directly to the refinement and quality assurance of next-generation AI systems for leading industry clients.