AI Trainer
In this role, I work as an AI Trainer and evaluator, benchmarking a developing AI model against leading competitors in the market. The core objective is to assess model performance, accuracy, and reasoning quality through structured Side-by-Side (SxS) comparisons. The tasks primarily involve image recognition workflows using video inputs: I analyze how the model interprets visual content, identifies objects, understands context, and responds to prompts. Each comparison assesses the target model and competing systems against criteria such as correctness, completeness, consistency, and reasoning clarity. For every task, I probe the model with at least three follow-up prompts to test robustness, contextual memory, error correction, and logical consistency. This iterative probing evaluates model behavior beyond surface-level responses.