Khushi Jain

Versatile image and video annotator with 2 years of experience

New Delhi, India
$10.00/hr, Intermediate, Appen, Clickworker, Scale AI

Key Skills

Software

Appen
Clickworker
Scale AI

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming
Image
Video

Top Task Types

Computer Programming Coding
Data Collection
Text Generation
Text Summarization
Translation Localization

Freelancer Overview

I have hands-on experience working with leading AI data labeling and training platforms such as Appen and Outlier, where I contributed to diverse projects involving data annotation, prompt engineering, and quality assurance. My roles have required me to accurately label and categorize large datasets, develop and test prompts for AI models, and ensure data integrity for machine learning workflows. I am proficient in using various annotation tools and have applied programming skills to automate repetitive tasks, streamline labeling processes, and validate training data for consistency and accuracy. In addition to my technical expertise, I have worked on projects focused on AI safety and ethical prompt engineering, including tackling Universal Jailbreak challenges. My attention to detail, ability to follow complex guidelines, and commitment to delivering high-quality annotated data set me apart in the field of AI training data.

Intermediate, Hindi, English, Spanish

Labeling Experience

Scale AI

Bilingual Prompt-Based Data Labeling for AI Model Training

Scale AI, Text, Entity NER Classification, Question Answering
I worked on a large-scale data labeling project focused on generating and annotating English and Hindi prompts and responses for training and evaluating large language models. Tasks included creating diverse prompt-response pairs, classifying intent and sentiment, and translating content between English and Hindi to ensure high-quality bilingual datasets. I also participated in prompt evaluation and red teaming for AI safety, adhering to strict quality guidelines and accuracy benchmarks. The project involved labeling over 20,000 text samples and required both linguistic expertise and programming skills for data validation and automation.

2025 - 2025
Appen

Code Annotation and Function Classification for AI Model Training

Appen, Computer Code Programming, Entity NER Classification, Evaluation Rating
I contributed to a large-scale data labeling project focused on annotating and classifying programming code snippets for LLM training and evaluation. My responsibilities included identifying and labeling entities in code (such as functions, variables, and classes), classifying code by language and function, and generating prompt-response pairs for supervised fine-tuning (SFT). I also evaluated AI-generated code outputs for correctness and style, ensuring high-quality datasets for model improvement. The project involved over 10,000 code samples across multiple programming languages and required strict adherence to annotation guidelines and quality assurance protocols.

2024 - 2025

Education

Cardiff University

Bachelor of Science, Computer Science

2021 - 2024

Excelsior American School

International Baccalaureate Diploma, International Baccalaureate

2018 - 2020

Work History

Chegg

Computer Science Subject Matter Expert

N/A
2025 - 2025

Outlier

AI Data Annotator and Trainer

New Delhi
2023 - 2025