Chaitu Talathi

Pod Lead (SR-Edu-ICE – Implicit Code Execution, Google Gemini AIS)

Pune, India
$40.00/hr · Expert

Key Skills

Software

No software listed

Top Subject Matter

STEM Education Visuals via Code Generation
Large Language Model (LLM) Coding Dataset (SWE Evaluation)
AI Safety and Computer-Use Modeling

Top Data Types

Image
Text

Top Task Types

RLHF

Freelancer Overview

Pod Lead for SR-Edu-ICE (Implicit Code Execution, Google Gemini AIS) with 6+ years of professional experience across complex workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. Holds a Bachelor of Engineering from Mumbai University (2021). AI-training work covers image and computer-code data, with labeling workflows in programming, coding, and evaluation.

English (Expert)

Labeling Experience

Pod Lead (SR-Edu-ICE – Implicit Code Execution, Google Gemini AIS)

Image
Led efforts to enhance the model’s ability to generate academic visuals via code-generated imagery for education. Built ground-truth reference code and established a code-quality evaluation rubric for model-generated content. Coordinated reviews and collaborated with STEM stakeholders for quality alignment.

• Screened prompts and images for suitability as codable tasks early in the process
• Verified and created reproducible code scripts to match expected diagrams and visuals
• Assessed model-generated code for correctness, efficiency, and reproducibility using a standardized rubric
• Directed pod reviews and quality control checkpoints across workflow stages

2025 - Present

Pod Lead (Google SWE Projects, SWE Bench Phase 2, SWEAP Simplified & SWEAP Salvage)

Curated and validated datasets for LLM training and evaluation, focusing on reliability and issue–test pairing for SWE pipelines. Managed a pod of annotators, enforced data standards, conducted quality audits, and collaborated on SWE dataset creation guidelines. Enhanced dataset quality by applying RLHF principles and salvage strategies for defective samples.

• Validated and refined over 500 GitHub issue–test samples for LLMs
• Analyzed, refactored, and modified Python test cases for robustness
• Established and enforced quality standards and audit procedures for annotators
• Supported development of the Google SWE dataset creation framework

2025 - 2025

Team Lead (Anthropic – Computer Use Model v2)

Text, RLHF
Oversaw the creation of large-scale RLHF and preference datasets for computer-use AI models, prioritizing AI safety. Designed annotation workflows for cursor, screen, and keyboard actions, emphasizing pixel-level accuracy and dataset reliability. Led a team for coordinated quality reviews and implementation of AI safety checks to strengthen model safety measures.

• Directed a multi-level team across thousands of annotation tasks
• Managed structured annotation and workflow for human-computer interactions
• Applied RLHF methodology to improve AI helpfulness and reduce hallucinations
• Coordinated implementation of prompt-injection and safety review systems

2024 - 2025

RLHF Trainer (xAI, Python & Data Science)

Text, RLHF
Developed RLHF datasets for data science applications by designing prompts and evaluating model responses for reasoning, accuracy, and clarity. Authored ideal responses, validated model code, and ensured consistency with English and Markdown standards. Curated diverse datasets with controlled complexity and contributed to regular quality reviews for continual dataset improvement.

• Created and assessed 100+ complex prompts and 200+ model responses
• Validated model-generated Python code for correctness and execution
• Authored reference responses to guide fine-tuning and standardization
• Led quality review sessions to improve prompt and evaluation quality

2024 - 2024

Education

Mumbai University

Bachelor of Engineering, Computer Science

2016 - 2021

Work History

Feynn Labs

Data Science Intern

Mumbai
2024 - 2024

N N Sales Corporation

Business Owner and Partner

Mumbai
2020 - 2024