Muneeb Ul haq

Prompt Engineer & AI Content Evaluator - Technology & Internet

Srinagar, India
$23.55/hr, Intermediate, Appen, CVAT, Data Annotation Tech

Key Skills

Software

Appen
CVAT
Data Annotation Tech
iMerit
Labelbox
Label Studio
Remotasks
Scale AI
SuperAnnotate
Surge AI
Telus

Top Subject Matter

No subject matter listed

Top Data Types

Image
Text
Audio

Top Task Types

Polygon
Segmentation
Object Detection
Bounding Box
Tracking
Classification
Entity (NER) Classification
Emotion Recognition
Translation and Localization
Prompt and Response Writing (SFT)
Question Answering
Text Generation
Text Summarization
Evaluation Rating
RLHF

Freelancer Overview

I am a detail-oriented professional with hands-on experience in AI data labeling, annotation, and large language model (LLM) content evaluation. I have evaluated and scored AI-generated text and images using rubric-based frameworks on Supervised Fine-Tuning (SFT), RLHF, RLVR, and model alignment projects. I am skilled in text classification, sentiment analysis, entity tagging, and chatbot rating, with a strong focus on accuracy, consistency, and analytical reasoning. My multidisciplinary education in science and my certifications in AI, data analysis, and cybersecurity help me adapt quickly to new tools and guidelines. I am passionate about contributing high-quality AI training data and improving model performance in dynamic, remote work environments.

Intermediate, Hindi, Urdu, English

Labeling Experience

Scale AI

LLM Content Evaluator and AI Data Labeler

Scale AI, Text, Evaluation Rating
I evaluated and rated LLM-generated content for quality, accuracy, and alignment using rubric-based frameworks. Tasks included scoring outputs, analyzing content for safety and tone, and providing feedback to improve model responses. The work spanned Supervised Fine-Tuning (SFT), RLHF, and RLVR workflows.
• Evaluated chatbot and general AI outputs on criteria such as reasoning depth and prompt alignment.
• Performed reinforcement learning-based reward signal evaluations to enhance alignment.
• Labeled data across multiple batches, contributing directly to instruction dataset quality.
• Used various online productivity and annotation tools for remote AI labeling work.

2025
Scale AI

SparrowSignet, Instruction Writing & Rubric-Based Evaluation

Scale AI, Text, Text Generation, Evaluation Rating
Contributed to high-quality training data for instruction-tuned LLMs. Work included:
• Writing detailed, domain-diverse prompts and solutions.
• Creating structured rubrics for evaluating model responses.
• Performing rubric-based scoring for correctness, safety, and reasoning depth.
• Enhancing prompt clarity, diversity, and difficulty to meet evolving data standards.

2025
Scale AI

Aether, LLM Response Ranking & Evaluation

Scale AI, Text, RLHF, Evaluation Rating
Evaluated and ranked multiple AI-generated responses based on correctness, helpfulness, safety, and alignment with project guidelines. Responsibilities included:
• Scoring multi-turn conversations using fine-grained rubrics.
• Identifying safety risks, hallucinations, and policy violations.
• Selecting the best responses to train reward models (RLHF).
• Maintaining high consistency across edge cases and ambiguous queries.

2025
Scale AI

Guitar Riff, Creative Prompt Generation

Scale AI, Text, Text Generation, Evaluation Rating
Generated diverse, high-quality creative prompts for a music-themed dataset used to train generative AI models. Tasks included:
• Writing original prompts for guitar riffs in multiple styles and moods.
• Calibrating difficulty and creativity levels according to project guidelines.
• Ensuring linguistic diversity, clarity, and model-compatible formatting.
• Reviewing and refining prompt-response pairs for coherence.

2025
Surge AI

Guitar Pinstripe Prompt Generation, Text Polishing & Prompt Alignment (Outlier AI)

Surge AI, Text, Classification, Question Answering
I am currently working on the Guitar Pinstripe project on Outlier AI, polishing and refining text responses so they sound natural and human. I edit for grammar, fluency, tone, and creativity while ensuring every response strictly follows the assigned prompt. I evaluate whether the model fulfilled the instruction and correct unnatural phrasing, punctuation, and off-prompt content. I also help improve dataset quality for training large language models.

2025

Education

Indira Gandhi National University

Masters, Public Administration

2025 - 2025

WorldQuant University

Master of Science, Financial Engineering

2024 - 2025

Work History

Scale AI

Data Annotator/Prompt Engineer

Srinagar
2024 - Present

Sourch Technologies Pvt. Ltd.

Business Development Associate

Sopore
2022 - 2023