Harsha Shah

AI Training Data Specialist | Prompt Engineer | LLM Evaluator

Kathmandu, Nepal
$25.00/hr | Intermediate | Labelbox | Crowdsource | CVAT

Key Skills

Software

Labelbox
CrowdSource
CVAT
Appen

Top Subject Matter

Generative AI – Prompt Engineering, LLM Training & Response Evaluation
AI Systems – Agent Workflows, Tool Integration & Task Optimization
AI Safety & Data Quality – Bias Testing, Adversarial Prompting & Model Evaluation

Top Data Types

Text
Image
Medical DICOM

Top Task Types

Bounding Box
Segmentation
Classification
Object Detection
Text Generation
Question Answering
Text Summarization
RLHF
Fine-tuning
Evaluation/Rating
Computer Programming/Coding
Data Collection

Freelancer Overview

My experience in AI training data and data labeling is rooted in a strong academic foundation and hands-on industry work. After completing my MS in AI Applications from the University of Strathclyde in Glasgow, I began working with Outlier AI and Stellar AI as a freelance contributor, focusing on data training and collection. In these roles, I developed the ability to analyze datasets critically, identify quality issues, and refine training inputs to improve model outputs. My work involved evaluating LLM responses for accuracy, coherence, and instruction adherence, ensuring that outputs met high-quality standards across a variety of use cases. In addition to evaluation, I specialize in AI prompt engineering—designing and optimizing prompts to enhance model performance across diverse tasks. I have experience in adversarial prompting and testing, where I intentionally stress-test models to uncover weaknesses and guide improvements. I’ve also worked on AI agent task optimization, assessing how models interact with external tools like search and computational systems. A strong focus of my work has been bias and safety testing, ensuring responses remain accurate, fair, and ethically aligned. This combination of analytical rigor, practical experience, and a deep understanding of LLM behavior allows me to contribute effectively to building more reliable and robust AI systems.

English: Intermediate

Labeling Experience

Senior Reviewer

Text | Evaluation/Rating
During this project, I was promoted to reviewer. I reviewed the work of other evaluators, corrected their mistakes according to the project manual, and gave them constructive feedback.

2023 - 2025

Data Collection For Mental Health Chatbot

Text | Data Collection
This was a small project for Algotomy, a startup specializing in medical data. I was tasked with collecting empathetic responses to patients' problems. Public mental health forums and various Reddit subreddits were scraped with Python into rows pairing each post with its reply (the "2X2" format). Unsuitable rows were removed manually, and the company received a quality dataset for building its model.

2024 - 2024

Data Labelling Generalist

Text | Evaluation/Rating
This project involved studying chat histories between users and an AI model. The work was multi-faceted and required:

AI Prompt Engineering – Designing and optimizing prompts to improve LLM performance across diverse tasks.
LLM Response Evaluation – Assessing AI-generated outputs for accuracy, coherence, and instruction adherence, among other criteria.
Adversarial Prompting & Testing – Crafting structured prompts that deliberately make the model fail, exposing weaknesses to guide improvements.
AI Agent Task Optimization – Evaluating agent behavior in tool-assisted workflows (Google Search, Maps, Wolfram).
Bias & Safety Testing – Ensuring responses are accurate, unbiased, and ethically aligned.

2022 - 2023

STEM Specialization Data Annotation

Text | RLHF
This project covered the Physics and Chemistry domains. Questions and LLM responses were each evaluated for factual and mathematical correctness.

2022 - 2022

Education

University of Strathclyde

Master's in Science, AI and Applications

2021 - 2022

Kathmandu University

Bachelor in Engineering, Chemical Engineering

2015 - 2019

Work History

Algotomy

Data Collection and Cleaning

Kathmandu
2024 - 2024

Outlier AI

Senior Reviewer

Remote
2022 - 2023