Saurabh Gaur

AI Evaluation Specialist — Outlier.ai, OpenClaw Atlas Project

Rudrapur, India
$33.00/hr, Intermediate, Other

Key Skills

Software

Other

Top Subject Matter

AI Model Evaluation
Logistics/Healthcare Agents
Image Data Labeling and Prompt Engineering

Top Data Types

Text
Image
Document

Top Task Types

Classification

Freelancer Overview

AI Evaluation Specialist at Outlier.ai on the OpenClaw Atlas Project, with 4+ years of professional experience across complex workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. Education includes a Bachelor of Technology from AKTU, Lucknow (2025) and a Diploma in Mechanical Engineering from BTEUP, Uttar Pradesh (2019). AI-training focus covers Text and Image data and labeling workflows including Evaluation, Rating, and Classification.

Intermediate, English, Hindi

Labeling Experience

AI Evaluation Specialist — Outlier.ai, OpenClaw Atlas Project

Text
As an AI Evaluation Specialist at Outlier.ai, I designed, executed, and documented complex multi-model AI evaluations. My efforts included structured dataset creation, scenario development, and benchmarking model responses for fairness and performance. I developed failure taxonomies and managed evaluation runs across different universes and domains.
• Crafted and utilized synthetic text datasets to test model reasoning and assess time-domain consistency.
• Benchmarked single-turn model responses using identical prompts across multiple LLMs.
• Developed taxonomies for categorizing and flagging failure modes such as privacy and defamation-risk.
• Oversaw evaluation of agent-based tasks in both logistics/custody and healthcare subject domains.

2026 - Present

AI Image Dataset & Prompt Engineering Pipeline — Personal Project

Other, Image, Classification
I independently constructed an annotated image dataset using Stable Diffusion and Midjourney with structured metadata and version tracking. The pipeline included prompt engineering, image generation, and thorough quality evaluation for dataset consistency. I developed rubric-based evaluation and taxonomy classification to enhance multi-style data utility for ML use.
• Generated and labeled over 200 images with prompt-string metadata.
• Created and tracked quality criteria for dataset uniformity.
• Applied classification and annotation relevant for model training pipelines.
• Maintained reproducibility via versioning and rubric-driven review.

2024 - Present

Education

AKTU, Lucknow

Bachelor of Technology, Computer Science and Engineering
2020 - 2025

BTEUP, Uttar Pradesh

Diploma in Mechanical Engineering
2016 - 2019

Work History

Tata Advanced Systems Ltd.

Engineering Technical Associate

Rudrapur
2022 - 2023

Lancer Industries

IT Systems & Software Support Technician

Rudrapur
2021 - 2022