William Eniola

AI Researcher – Benchmark Dataset Labeling & Evaluation

Ekiti, Nigeria
$25.00/hr
Intermediate
Appen

Key Skills

Software

Appen

Top Subject Matter

LLM alignment
model robustness
bias evaluation

Top Data Types

Text
Image
Document

Top Task Types

RLHF

Freelancer Overview

AI researcher focused on benchmark dataset labeling and evaluation, with 2+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Core strengths include Hugging Face and Appen. Education: Bachelor of Science, New York University, Tandon School of Engineering (2020), and Graduate Certificate, Columbia University, School of Professional Studies (2021). AI-training focus covers Text data and labeling workflows including Evaluation, Rating, and RLHF.

Intermediate
English
Yoruba

Labeling Experience

AI Researcher – Benchmark Dataset Labeling & Evaluation

Text
As an AI Researcher, I designed and implemented benchmark datasets for large language model evaluation. I focused on measuring AI model performance, alignment, and hallucination mitigation strategies. My work contributed to open research communities and responsible AI deployment frameworks.
• Created factual accuracy, bias, and robustness benchmarks for LLMs
• Used Python-based pipelines (Hugging Face, LangChain, PyTorch) for dataset curation and labeling
• Collaborated with interdisciplinary teams to ensure diverse evaluation criteria
• Shared datasets and evaluations with research and stakeholder audiences


2025 - Present
Appen

AI Data Annotator & QA Lead – RLHF and QA for LLMs

Appen
Text
RLHF
As an AI Data Annotator and QA Lead at Appen, I annotated and quality-reviewed over 60,000 data points for AI research clients. My work involved RLHF evaluation tasks for large language model providers. I managed a team of annotators and enforced data quality standards.
• Specialized in Reinforcement Learning from Human Feedback (RLHF) tasks
• Rated and ranked model responses for helpfulness, accuracy, and safety
• Used Appen's platform and maintained high inter-annotator agreement
• Promoted to QA Lead and ensured guideline compliance across the team


2022 - 2024

Education


Columbia University, School of Professional Studies

Graduate Certificate, Artificial Intelligence and Machine Learning

2021

New York University, Tandon School of Engineering

Bachelor of Science, Computer Science

2016 - 2020

Work History


Independent

AI Researcher

New York
2025 - Present

NIGE

Project Manager – Technology & AI Initiatives

New York
2024