Suhaib Khan - AI Labeling & LLM Evaluation (Project Lorem)

Key Skills

Software

Mercor

Other

Top Subject Matter

Emotional support

AI empathy

Therapeutic response modeling

Top Data Types

Audio

Document

Image

Text

Video

Top Task Types

Audio Recording

Classification

Data Collection

Evaluation Rating

Prompt Response Writing SFT

Text Summarization

Freelancer Overview

I have hands-on experience in AI training data and large language model (LLM) evaluation, with a focus on rubric-based assessment of model outputs for factuality, reasoning quality, helpfulness, and alignment. As a Freelance AI Systems Evaluation & Quality Analyst, I have worked extensively on dataset labeling, comparative response ranking, and quality validation across multiple domains. I am experienced in identifying errors, edge cases, and inconsistencies, and in applying structured evaluation frameworks to ensure high-quality training data. I have contributed to projects involving emotional support response evaluation, AI-generated insights analysis, and speech dataset validation, applying analytical reasoning and attention to detail throughout. I have managed high-volume review tasks under tight deadlines while maintaining consistency and accuracy. With a background in Cyber Security and technical proficiency in tools such as Python and Airtable, I bring a disciplined, data-driven approach to AI model evaluation and continuous improvement.

Intermediate English

Labeling Experience

Project Voiceover (Accents)

Mercor | Audio | Audio Recording

Successfully completed a task-based AI data collection project, performing 210 live phone-call recordings across multiple speaking variations (normal, speakerphone, muffled, quiet, and natural pauses). Tested and applied different vocal and speech patterns to simulate real-world customer interactions for AI evaluation. Ensured data accuracy, structured documentation, and workflow compliance for each script and variation. Gained experience in remote project execution, quality assurance, and problem-solving. Strengthened process-driven, analytical, and organizational abilities, applicable to IT, administrative, and data evaluation tasks.

2025 - 2026

Project Lorem

Text | Prompt Response Writing SFT

Contributed to Project Lorem, evaluating AI-generated emotional support responses to improve therapeutic quality and human-aligned model behavior. Assessed model outputs across structured evaluation axes including empathy, validation, emotional attunement, and psychological safety. Completed structured onboarding and adhered to project-specific evaluation rubrics and quality standards. Collaborated with the Mercor team via Slack and internal platforms to ensure consistent, high-quality preference data for model optimisation.

2026 - 2026

Project Meereen

Text | Prompt Response Writing SFT

Contributed to an AI evaluation project by writing realistic user prompts and assessing AI-generated responses across multiple quality dimensions. Ensured high-quality output by applying structured evaluation criteria and providing feedback to improve AI model performance. Collaborated with the Mercor team via Slack and internal platforms to maintain consistent, high-quality data for model optimisation. Participated in fast-paced project sprints, balancing accuracy, speed, and adherence to confidentiality standards.

2026 - 2026

Sports AI Insights Evaluation

Other | Text | Prompt Response Writing SFT

Evaluated AI-generated pre-match insights for upcoming football matches across multiple leagues, assessing helpfulness, factual accuracy, clarity, and trustworthiness. Completed structured tasks at two pre-match intervals (8 hours and 45 minutes before kick-off), applying rigorous guidelines and rating criteria. Managed a time-sensitive multi-task workflow, converting task release times across time zones and tracking potential pay per task. Utilised Airtable to organise, track, and manage data efficiently, ensuring smooth workflow and timely completion of assessments. Gained experience in analytical review, sports domain evaluation, and quality assurance for AI outputs.

2025 - 2026

Generalist AI/Data Contributor

Mercor | Image | Classification

Reviewed and labeled image and text datasets following structured annotation guidelines to ensure accuracy and consistency. Maintained documentation rigor and supported high-quality AI data outputs. Key responsibilities included labeling and categorizing diverse text and image data according to predefined classes, validating and correcting labeling inconsistencies to maintain data integrity, using annotation tools to streamline workflow and improve efficiency, and communicating findings and collaborating with remote project teams.

2025 - 2025

Education

Kingston University

Bachelor of Science, Cyber Security and Digital Forensics

2021 - 2024

West Thames College

BTEC Level 3 Extended Diploma, Information Technology Systems Support

2019 - 2021

Work History

Mercor AI

AI Systems Evaluation & Quality Analyst

London

2025 - Present

Sweet Affairs

Administrator

London

2021 - 2024