Gurpreet Kaur

LLM data trainer with STEM background and Hindi-English expertise

Dibrugarh, Assam, India
$20.00/hr | Intermediate | Labelbox | Remotasks | Scale AI

Key Skills

Software

Labelbox
Remotasks
Scale AI

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Image
Text

Top Task Types

Audio Recording
Data Collection
Evaluation Rating
Prompt Response Writing SFT
RLHF

Freelancer Overview

I am an experienced AI Data Trainer with a strong background in LLM evaluation, multilingual data labeling (English & Hindi), and STEM-focused dataset creation. Over the past several years, I have worked extensively on tasks involving prompt writing, response analysis, instruction-following evaluation, safety review, and rubric development. My expertise spans NLP, reasoning tasks, multimodal annotation, and reinforcement-style feedback where I refine incorrect model outputs using accurate STEM concepts. I have also contributed to high-quality datasets across text, image, and audio modalities—including transcription, classification, and quality assessment—ensuring clean, consistent, and contextually aligned data for advanced AI systems. With a solid academic foundation in Mathematics and all major STEM subjects (PCMB), I bring strong analytical skills, conceptual clarity, and attention to detail to every project. I excel in crafting challenging prompts to identify model weaknesses, evaluating bilingual outputs for accuracy and naturalness, and producing structured, guideline-aligned annotations. My combined STEM knowledge, linguistic fluency, and hands-on experience with AI model training set me apart as a versatile and reliable data labeling professional.

Intermediate | Hindi | Punjabi | English

Labeling Experience

Scale AI

Rubric Development & Quality Review

Scale AI | Text | RLHF | Prompt Response Writing SFT
I worked on developing and refining rubrics specifically for evaluating chatbot and LLM behavior. This included designing clear criteria for judging response quality, reasoning accuracy, safety compliance, tone, and instruction-following. I also helped improve annotation guidelines by identifying ambiguous cases, clarifying edge conditions, and reviewing example responses. My work ensured consistent and reliable evaluation of chatbot outputs across large datasets, maintaining high-quality standards for conversational AI training.

2025
Scale AI

STEM Data Annotation

Scale AI | Text | RLHF | Prompt Response Writing SFT
This project involved creating and validating STEM datasets across Physics, Chemistry, Mathematics, and Biology (PCMB). I reviewed and corrected AI responses, generated concept-based prompts, and identified reasoning errors in domain-specific explanations. The project required strong subject knowledge and bilingual clarity, as many tasks were completed in both English and Hindi. I labeled and refined more than 3,000 STEM tasks, ensuring mathematical accuracy, scientific correctness, and logical consistency. Quality measures included verifying formulas, checking step-by-step reasoning, adhering to domain rubrics, and maintaining cross-language precision.

2023
Scale AI

Image Annotation

Scale AI | Image | RLHF | Prompt Response Writing SFT
I worked on multimodal AI projects where I created prompts for images and evaluated model responses based on visual understanding. This included writing descriptive and challenging prompts about image content to test model perception, assessing whether the model identified objects and scenes correctly, and reviewing its reasoning or explanations. I also checked visual-text alignment, verified factual correctness, and ensured that the model’s answers matched the image context. All work followed strict annotation guidelines, consistency checks, and quality standards across multiple batches of image-based tasks.

2024 - 2025
Labelbox

Multilingual Text & LLM Annotation Project

Labelbox | Text | RLHF | Prompt Response Writing SFT
I worked on a large-scale multilingual NLP project focused on training and evaluating AI language models in English and Hindi. The scope of the project involved writing diverse prompts, categorizing text, evaluating AI-generated responses, checking summarizations, reviewing translations, and performing safety and content-moderation assessments. I annotated and reviewed over 10,000 text items, covering dialogue tasks, reasoning tasks, and classification tasks. Throughout the project, I adhered to detailed annotation guidelines, rubric-based scoring frameworks, and multi-level quality checks to ensure accuracy, consistency, and alignment with linguistic and safety standards.

2023

Education

University of Delhi

Bachelor of Science, Mathematics

2018 - 2021

Work History

Accenture

Quality Assurance Analyst

Noida
2022 - 2023

Quizzy, MathMaster, CourseHero

STEM Tutor & Content Developer

Delhi
2021 - 2022