AI Evaluation Specialist — Fellowship | Handshake AI (Multimango Platform)
As an AI Evaluation Specialist at Handshake AI (Multimango Platform), I performed comparative preference ranking and rubric-based evaluation of AI model outputs across text, image, audio, and video modalities. I assessed responses for quality, instruction following, and coherence, and assessed text-to-image and text-to-video generations for alignment with their prompts. I used structured rubrics and Elo-based and head-to-head (H2H) frameworks to keep judgments consistent and accurate across diverse tasks.
• Performed side-by-side comparative evaluation of AI model outputs.
• Executed preference ranking for conversational AI and multi-modal tasks.
• Applied Elo-based and H2H comparative frameworks, including Omni Elo.
• Handled ambiguous multi-modal edge cases and escalated them for review.