Abdul Waheed

AI/ML Engineer | Generative & Agentic AI Specialist | LLM Evaluation Expert

Lahore, Pakistan
$20.00/hr | Intermediate | AWS SageMaker | Appen | Google Cloud Vertex AI

Key Skills

Software

AWS SageMaker
Appen
Google Cloud Vertex AI
Label Studio
OpenCV AI Kit (OAK)
Other
Internal/Proprietary Tooling

Top Subject Matter

No subject matter listed

Top Data Types

Image
Text

Top Label Types

Evaluation Rating
Prompt Response Writing (SFT)
RLHF
Text Summarization
Translation and Localization

Freelancer Overview

AI/ML Engineer with strong expertise in Generative AI and in RLHF- and SFT-based LLM evaluation, with contributions to large-scale initiatives such as Google GAIA and Twitter AI (Grok). Skilled at crafting, curating, and validating datasets that bridge human reasoning and machine learning, from multimodal search workflows to academic question–answer generation.

At Google GAIA, I developed datasets that model how humans search, analyze, and reason step by step across text, image, video, and audio modalities. This work helped train Gemini models to understand multi-step problem solving, compare model reasoning paths against human logic, and refine outputs for accuracy and coherence.

At Twitter AI, I contributed to RLHF pipelines by designing prompts, reviewing Grok model responses, identifying reasoning gaps, and authoring ideal completions. This process strengthened model alignment and reduced hallucinations.

Overall, my focus is on building high-quality, explainable data that drives human-like reasoning in AI systems while ensuring fairness, linguistic diversity, and real-world performance.

Intermediate | Urdu | Punjabi | English

Labeling Experience

Google GAIA – Human Search Reasoning and Multimodal Dataset Creation

Internal Proprietary Tooling | Image | Text Generation | RLHF
Worked under Turing as part of the Google GAIA (Gemini) data team to create high-quality datasets for training multimodal LLMs. Designed and annotated human reasoning sequences that replicate how users search, analyze, and synthesize information step by step across text, image, video, and audio inputs. Labeled reasoning chains, extracted relevant data, and wrote human-like solutions to teach models structured thinking. Validated model reasoning by comparing outputs with human-generated logic under RLHF and SFT frameworks. Ensured data accuracy, consistency, and diversity across modalities.

2024

Twitter AI (Grok) – Prompt Evaluation and Ideal Completion Generation

Internal Proprietary Tooling | Computer Code Programming | Text Generation | RLHF
Contracted through Turing for the Twitter AI (Grok) team to support reinforcement learning from human feedback (RLHF). Evaluated model responses to diverse user prompts, identified factual and logical inconsistencies, and wrote ideal responses to guide retraining. Collaborated with internal QA teams to assess alignment, tone, and factual accuracy. Provided data-driven feedback that improved model performance, coherence, and user satisfaction.

2024

Education

COMSATS University Islamabad

Bachelor of Science, Computer Science

2020 - 2024

Work History

Hubble42 Inc.

AI/ML Engineer - Generative AI

Lahore
2024 - Present

codSeed

AI/ML Engineer

Lahore
2023 - 2024