Full-Stack Software Engineer for AI Training
Worked on an AI training and evaluation project involving coding agents and multimodal data annotation, focused on evaluating how AI models interpret and respond to real-world coding tasks in small to mid-sized codebases.
- Designed prompts with varying levels of ambiguity (e.g., underspecified requirements, conflicting constraints) to probe model reasoning, accuracy, and failure modes.
- Performed detailed review and annotation of model outputs: validated code changes, identified incorrect assumptions, and verified logical correctness.
- Contributed to data labeling across multiple modalities: text (response quality and correctness), image (classification and attribute labeling), audio (transcription validation), and video (content categorization).
- Maintained high annotation quality by following defined guidelines, ensuring consistency, and handling edge cases carefully.
- Worked within structured workflows, reviewed outputs iteratively, and refined prompt design to produce better evaluation data.