AI Model Evaluator (Contract)
As an AI Model Evaluator, I assessed and ranked AI-generated responses spanning text, image, and voice. I identified model failure modes and conducted error analysis to improve the quality of model outputs. I performed multimodal annotation tasks and real-time conversational AI testing under high efficiency standards.
• Evaluated outputs for accuracy, logical consistency, and instruction adherence.
• Conducted error analysis for hallucinations and misalignment.
• Executed tasks including masking, referring expressions, and multimodal evaluation.
• Assessed naturalness and adaptability in conversational AI systems.