Darius Gathungu

AI Training Contributor - Large Language Models

New York, USA
$15.00/hr · Expert
AWS SageMaker · Mercor · Data Annotation Tech

Key Skills

Software

AWS SageMaker
Mercor
Data Annotation Tech
CloudFactory

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Computer Code Programming
Image
Text
Video

Top Label Types

Fine Tuning
Evaluation Rating
Computer Programming Coding
Data Collection
Land Cover Classification
RLHF
Red Teaming

Freelancer Overview

I am an experienced AI training contributor and full-stack developer with a strong background in data labeling, annotation, and large language model (LLM) evaluation. My expertise includes prompt engineering, RLHF workflows, and structured data validation using Python, Django, and REST APIs. I have evaluated and ranked thousands of AI-generated outputs for factual accuracy, reasoning quality, and policy compliance, consistently maintaining high QA standards. My hands-on projects range from developing automated prompt evaluation frameworks and output consistency analyzers to building web-based annotation dashboards for grading and exporting model responses. I am skilled at identifying hallucinations, bias, and logical inconsistencies, and I leverage automation to improve evaluation throughput and data quality across technical and analytical domains.

English (Expert)

Labeling Experience

CloudFactory

Structured Output Validation Initiative (Annotation Project)

CloudFactory · Text · Land Cover Classification · RLHF
In the Structured Output Validation Initiative, I validated JSON-formatted model outputs against schema and logical requirements. My responsibilities included identifying formatting inconsistencies and logical errors within outputs. I also generated feedback logs to support iterative model training improvements.
• Performed validation against JSON schema standards.
• Flagged format and logic errors for dataset refinement.
• Produced structured logs for training teams.
• Enhanced dataset quality through targeted review.
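The validation workflow described above can be sketched in Python. The project's actual schema and tooling are not specified, so the field names (`answer`, `confidence`, `sources`) and this minimal validator are hypothetical illustrations of checking outputs against both schema and logical requirements:

```python
# Minimal sketch of structured-output validation. The schema below is an
# assumption for illustration; a real project would use a formal JSON Schema.
SCHEMA = {
    "answer": str,       # model's final answer text
    "confidence": float, # score expected in [0.0, 1.0]
    "sources": list,     # list of citation strings
}

def validate_output(output: dict) -> list[str]:
    """Return a feedback log of schema and logic errors (empty list = valid)."""
    errors = []
    for key, expected_type in SCHEMA.items():
        if key not in output:
            errors.append(f"missing required key: {key}")
        elif not isinstance(output[key], expected_type):
            errors.append(
                f"wrong type for {key}: expected {expected_type.__name__}, "
                f"got {type(output[key]).__name__}"
            )
    # Logical check beyond the schema: confidence must be a valid probability.
    if isinstance(output.get("confidence"), float) and not 0.0 <= output["confidence"] <= 1.0:
        errors.append("confidence out of range [0, 1]")
    return errors
```

Collecting errors into a list rather than raising on the first failure mirrors the "feedback logs" described above: each flagged output yields a structured record that training teams can review in bulk.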

2024
Data Annotation Tech

AI Safety Review and Moderation Task (Annotation Project)

Data Annotation Tech · Image · Land Cover Classification · RLHF
In the AI Safety Review and Moderation Task, I reviewed AI-generated responses to detect bias, harmful content, and policy non-compliance. Ambiguous or edge cases were escalated with detailed reporting. I also recommended alternative language while upholding instructional integrity.
• Conducted content review and moderation for bias and safety.
• Flagged and escalated policy violations.
• Provided constructive feedback for safer outputs.
• Helped improve policy adherence in training datasets.

2024
Mercor

LLM Comparative Ranking Project (Annotation Project)

Mercor · Computer Code Programming · Evaluation Rating · Data Collection
In the LLM Comparative Ranking Project, I assessed and ranked multiple candidate responses using a predefined quality rubric. I provided structured justifications for evaluation decisions and ensured consistency across annotation tasks. This project emphasized fair and systematic ranking for dataset training purposes.
• Used comparative scoring to evaluate language model outputs.
• Documented rationale for score assignments.
• Ensured reliability and consistency over evaluation cycles.
• Supported dataset curation through high-quality feedback.
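Rubric-based comparative ranking like the above can be sketched as a weighted score over criteria. The project's actual rubric and weights are not public, so the criteria and weights here are invented for illustration:

```python
# Hypothetical rubric: per-criterion weights summing to 1.0.
RUBRIC = {"accuracy": 0.5, "clarity": 0.3, "formatting": 0.2}

def rank_responses(scored: dict[str, dict[str, int]]) -> list[str]:
    """Rank response IDs by weighted rubric score, best first.

    `scored` maps a response id to its per-criterion scores (e.g. 1-5).
    """
    def total(scores: dict[str, int]) -> float:
        return sum(RUBRIC[c] * scores[c] for c in RUBRIC)
    return sorted(scored, key=lambda rid: total(scored[rid]), reverse=True)

# Example: resp_b wins on the weighted total (4.5 vs. 4.2).
ranking = rank_responses({
    "resp_a": {"accuracy": 5, "clarity": 3, "formatting": 4},
    "resp_b": {"accuracy": 4, "clarity": 5, "formatting": 5},
})
```

Fixing the weights up front and computing totals the same way for every candidate is one way to get the consistency across evaluation cycles that the project description emphasizes.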

2024
AWS SageMaker

AI Data Labeler / Training Contributor (Contract)

AWS SageMaker · Video · Fine Tuning · Evaluation Rating
As an AI Data Labeler and Training Contributor, I evaluated and annotated AI-generated text responses using structured grading rubrics. My work involved detailed analysis for accuracy, coherence, formatting adherence, and AI safety compliance, including the identification of hallucinations and policy violations. I also revised low-quality outputs and maintained high quality-assurance standards throughout large annotation workflows.
• Labeled and reviewed over 2,000 AI-generated text responses.
• Applied rubric-based scoring and conducted comparative response ranking.
• Identified edge-case failures and contributed to dataset improvement initiatives.
• Maintained high approval rates and facilitated training feedback via structured logs.

2024

Education

University of Texas at Dallas

Bachelor of Science, Information Technology

2022 - 2022

Work History

Independent Contractor

Full-Stack Developer

Austin
2023 - Present