Suraj Mandal

AI Training Specialist – Agentic Coding

Pune, India
$40.00/hr | Intermediate | Labelbox, Mindrift, Remotasks

Key Skills

Software

Labelbox
Mindrift
Remotasks
Scale AI
Toloka
Data Annotation Tech
Appen

Top Subject Matter

Agentic coding models
Autonomous software development
Generative AI for computer programming

Top Data Types

Text
Computer Code Programming
Image

Top Task Types

RLHF
Prompt + Response Writing (SFT)
Computer Programming/Coding
Function Calling
Red Teaming
Fine-tuning
Evaluation/Rating
Data Collection

Freelancer Overview

AI Training Specialist focused on agentic coding, with 5+ years of professional experience spanning complex workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. Holds a Bachelor of Engineering from the Faculty of Engineering, Jadavpur University (2020). AI-training work centers on computer code and programming data, with labeling workflows including RLHF and prompt + response writing (SFT).

Intermediate | English, Bengali, Hindi

Labeling Experience

AI Training Specialist – Agentic Coding

RLHF
As an AI Training Specialist at Alignerr, I evaluated agentic coding models on multi-step coding tasks and bug fixing using real open-source repositories. I provided detailed RLHF feedback and authored evaluations covering repository navigation, dependency resolution, test execution, and pull-request workflows. I created gold-standard solutions, adversarial prompts, and rubric-based annotations to surface edge cases and model weaknesses.
• Evaluated Claude and Cline-based models for code correctness, instruction following, and robustness.
• Authored multi-turn agentic evaluations and annotated findings as structured feedback.
• Wrote reference solutions for coding tasks to aid in reward model training.
• Stress-tested agents under ambiguous specifications, identifying unsafe outputs and reasoning failures.

2025 - Present

AI Training Specialist – Coding & Generative AI

Prompt + Response Writing (SFT)
At Outlier, I authored SFT and RLHF training data by crafting complex coding prompts and completions and by ranking responses across diverse programming languages. My work included rating AI responses against nuanced rubrics and designing agentic scenarios to train LLMs on practical code tasks. I also performed detailed error analysis and preference annotation, and authored gold-standard responses for development tasks.
• Generated high-quality supervised fine-tuning data for LLM coding tasks.
• Rated code outputs using atomic rubrics focused on clarity, correctness, and safety.
• Designed multi-turn tool-calling and function invocation challenges in agentic environments.
• Benchmarked front-end code generation and performed side-by-side model comparisons.

2024 - Present

Education

Faculty of Engineering, Jadavpur University

Bachelor of Engineering, Electronics and Telecommunication Engineering
2016 - 2020

Work History

FPL Technologies

Software Engineer

Pune
2025 - Present

Wipro

Senior Analyst

Gurugram
2024 - 2025