Gaurav Kumar


Data Annotation Specialist | Outlier-trained AI Labeler

Phagwara, India
$15.00/hr | Intermediate | Appen, Clickworker, Data Annotation Tech

Key Skills

Software

Appen
Clickworker
Data Annotation Tech
Labelbox
Label Studio
Roboflow
Scale AI
Telus

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming
Text
Video

Top Task Types

Action Recognition
Computer Programming/Coding
Object Detection
Text Summarization
Tracking

Freelancer Overview

I specialize in high-precision AI data annotation, Python code evaluation, and QA for LLM training. With 5+ years across text, image, video, and code datasets, I handle detailed reasoning assessments, code-review pipelines, tool-use checks, and advanced polygon segmentation on Labelbox. My work spans full evaluation workflows: analyzing reasoning chains, debugging model-generated Python code, writing gold-standard solutions, validating function calls, and performing timestamp-based video event labeling. I routinely catch inconsistencies, clarify ambiguous instructions, and maintain strict SLA-level quality with >95% QA accuracy.

I have hands-on experience with Label Studio, including setting up interfaces, managing project configs, and performing multi-annotator QA on tasks like text classification, NER, image bounding boxes, and structured JSON outputs.

My video labeling experience includes action recognition, frame-level event tagging, timestamp segmentation, and object tracking across sequences using Labelbox and Label Studio.

Key strengths: Python QA; structured data review (JSON/YAML); chain-of-thought verification; tool testing and bug reporting; fine-grained polygon masks; temporal video annotations; use of AI coding tools (Cursor, Claude Code, Windsurf); strong communication and guideline compliance.

Intermediate | Hindi, English

Labeling Experience

Appen

Safety Alignment: Toxicity, Risk Mitigation & Response Correction

Appen | Image | Bounding Box | Entity NER Classification
Evaluated high-risk prompts and model outputs for safety violations, misinformation, bias, and harmful reasoning. Generated safer alternatives, rewrote risky outputs, and tagged safety categories with high precision. Worked on datasets used for training AI systems to avoid harmful or unethical behaviors.


2024 - 2025
Data Annotation Tech

Code Annotation: Error Classification & Reasoning Evaluation

Data Annotation Tech | Computer Code Programming | Question Answering | Fine Tuning
Annotated and evaluated code snippets for correctness, reasoning quality, syntax issues, security flaws, and edge-case handling. Reviewed step-by-step reasoning produced by AI models, corrected logical errors, and wrote gold-standard explanations. Categorized code issues, validated expected outputs, and contributed to building datasets for AI coding assistants. Labeled 12,000+ code samples across algorithmic, debugging, and tool-use tasks.


2024 - 2025
Roboflow

Dataset Preparation & Segmentation Workflow (Roboflow)

Roboflow | Image | Segmentation | Classification
Worked on small image batches to prepare segmentation datasets using Roboflow. Tasks included importing images, validating polygon masks, correcting alignment issues, applying augmentations, and exporting datasets in COCO/YOLO formats for training. Ensured mask consistency, cleaned annotation noise, and maintained structured dataset versions. Also tested segmentation workflows between Roboflow and Labelbox to ensure compatibility.


2024 - 2024
Labelbox

LLM Reasoning, Evaluation & Chain-of-Thought Annotation

Labelbox | Text | Question Answering | Text Summarization
Worked on a large-scale LLM evaluation initiative focused on refining agent behavior, improving reasoning quality, and identifying hallucinations. Responsibilities included analyzing complex multi-step reasoning outputs, evaluating coherence and correctness, annotating alternative reasoning paths, writing gold-standard solutions, and detecting missing assumptions in agent tasks. Reviewed and improved metadata structures (JSON/YAML) and contributed to prompt rewriting for higher clarity. Maintained a >95% QA accuracy score and helped expand edge-case scenarios for autonomous agent testing. Processed more than 25,000 text samples across diverse task types.


2024 - 2024
Label Studio

Action Recognition & Temporal Labeling

Label Studio | Video | Segmentation | Classification
Annotated short and long-form video clips for action recognition, event detection, and temporal segmentation. Performed frame-by-frame tracking, timestamp-level labeling, and multi-object consistency checks across sequences. Flagged ambiguous frames, maintained strict temporal alignment, and complied with high QA thresholds (>95% accuracy). Processed thousands of video samples while documenting edge cases and ensuring reproducible annotation standards.


2023 - 2024

Education


Kendriya Vidyalaya Manesar

12th Grade, Science

2020 - 2021

Kendriya Vidyalaya Kankinara

10th Grade, Science

2017 - 2018

Work History


Automation Ace

Automation Intern

Patna
2025 - 2025