Irfan Khan - Expert in prompt engineering & data labeling in STEM

Key Skills

Software

Appen

Labelbox

Mindrift

Other

Top Subject Matter

No subject matter listed

Top Data Types

Document

Image

Text

Top Task Types

Classification

Evaluation Rating

Object Detection

RLHF

Text Generation

Freelancer Overview

Over the past two years, I’ve contributed to a wide range of high-impact AI training projects across Outlier, Alignerr (Labelbox), and Appen – Crowdgen. My experience includes writing prompts, crafting rubrics, evaluating model responses, and conducting fine-grained data annotation across domains like biology, chemistry, physics, coding, and general reasoning. Notable Outlier projects include Iris Gambit (MCQ generation and rubric writing), Phoenix (rubric evaluation across subdomains), Cracked Vault (prompt writing from research abstracts), and Thales Tales (chain-of-thought model rewrites). My work has consistently prioritized MECE-compliant criteria, formatting accuracy, and deep model understanding. In addition, I’ve contributed to TTS and function-calling evaluations in Alignerr (Labelbox) and conversational response projects like Spearmint and Coffee at Appen. My strengths include precision prompt engineering, GTFA-based reasoning analysis, and clear, guideline-aligned labeling. I’m adept at identifying edge cases and model failure modes, and ensuring consistently high data quality across complex and evolving instructions. This mix of creativity, analytical thinking, and instruction adherence makes me an effective and versatile contributor to any AI training pipeline.

Intermediate

Urdu, Hindi, French, English, Punjabi

Labeling Experience

Function Calling – Image Annotation & Validation (Alignerr)

Labelbox, Video, Function Calling

Annotated images for function-calling tasks by linking visual elements to textual function prompts, ensuring alignment between model outputs and visual context. Evaluated correctness of function calls generated by the model (e.g., coordinates, labels, object presence), validated API-ready responses, and corrected function arguments to match visual content. Adhered to detailed client rubrics for visual accuracy, clarity, and API syntax.

2025 - 2025

CrowdGEN – Research Q&A and Distractor Writing (Appen)

Appen, Text, Question Answering

Created real-world event-based MCQs with correct insights and plausible distractors. Topics covered included global markets, regulatory decisions, and corporate performance. Followed a structured rubric to ensure clarity, neutrality, and informative value across hundreds of items.

2025 - 2025

LLM Prompt Evaluation – Outlier (Phoenix Project)

Other, Text, Classification

Evaluated and rated over 1,000 LLM-generated responses using detailed quality rubrics across physics, engineering, and general domains. Identified factual inconsistencies, ambiguity, and formatting issues. Suggested improved prompts/responses and conducted final quality audits as part of gold-standard calibration sets.

2025 - 2025

Education

University of Engineering & Technology, Lahore

Master of Science, Polymer & Process Engineering

2020 - 2022

Université de Montpellier; VSCHT

Erasmus Mundus Master, Membrane Engineering for Sustainable World

2018 - 2019

Work History

Alignerr

LLM Data Annotation

Sydney

2025 - Present

Appen - Crowdgen

AI Content Evaluator

Sydney

2025 - Present