Lead AI Trainer / Annotator (Lynx Project)
Designed, developed, and validated realistic multi-turn agentic workflows for evaluating AI agents in simulated environments. Authored and iteratively refined gold-standard agent solution trajectories using detailed system hints and correction cycles. Conducted comprehensive verification testing by running multiple LLM models through complex pipelines to enforce performance constraints and differentiate solution capabilities.
• Built interdependent, realistic databases exceeding 800 objects, incorporating noise and edge-case scenarios.
• Wrote verifiable developer instructions and behavioral rubrics to guide AI agents, covering Trace, DB, and Final Response constraint types.
• Performed rigorous verification and validation using multi-model, multi-run testing protocols to maintain task difficulty and dataset integrity.
• Enforced behavioral and task-specific rubric standards to prevent shortcut learning and brute-force retrieval.