Eric Kiruja

AI Trainer / LLM Evaluator (Remote, Contract) at Multiple AI Companies

Expert, Remotasks, Other, Mercor

Key Skills

Software

Remotasks
Other
Mercor
Telus

Top Subject Matter

Large Language Models
AI Safety
NLP Domain Expertise

Top Data Types

Text
Document

Top Task Types

Entity (NER) Classification

Freelancer Overview

AI Trainer / LLM Evaluator (Remote, Contract) at Multiple AI Companies, with 6+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Platform experience includes Remotasks and Mercor, among others. Holds a Bachelor of Arts from Arizona State University (2023). AI-training work focuses on Text data and on labeling workflows including Evaluation, Rating, and Entity (NER) Classification.

Expert

Labeling Experience

Mercor

AI Output Assessor at Mercor

Mercor, Text
At Mercor, I performed comparative assessments of AI-generated outputs with a focus on safety and guideline compliance. My contributions included ranking model responses and flagging unsafe or unclear outputs. Rigorous application of project rules was essential.
• Assessed AI outputs for clarity and policy adherence.
• Ranked model responses comparatively.
• Flagged and documented noncompliant or unsafe content.
• Produced evaluation reports for process improvement.

2024 - Present

LLM Evaluator at Outlier AI

Other, Text
At Outlier AI, I performed advanced LLM evaluation focusing on response ranking and error analysis. My primary tasks involved assessing the depth of reasoning and factual reliability of model outputs. This position required high-level attention to detail and strong analytical judgment.
• Evaluated LLM responses for complex reasoning and factual consistency.
• Identified and documented reasoning errors and hallucinations.
• Prioritized response ranking to ensure model reliability.
• Ensured alignment with client evaluation standards.

2024 - Present
Remotasks

AI Trainer / LLM Evaluator (Remote, Contract) at Multiple AI Companies

Remotasks, Text
I evaluated and ranked AI-generated responses to enhance accuracy, clarity, and compliance in large language models. I conducted adversarial testing and fact-checking, and provided written feedback to guide model improvements. My responsibilities included adapting to project guidelines while maintaining high-quality data creation and evaluation processes.
• Created and refined prompt–response pairs for supervised fine-tuning and RLHF workflows.
• Ranked and assessed model outputs for factuality, safety, and relevance.
• Fact-checked and verified training data using trusted sources.
• Provided detailed qualitative feedback for model improvement.

2021 - Present

Text Annotation Specialist at DataForce (TransPerfect)

Other, Text, Entity (NER) Classification
At DataForce (TransPerfect), I was responsible for annotating and validating NLP training data. I ensured data quality and integrity for production-level AI models. This role required diligence and thorough review of language data.
• Annotated text data according to entity guidelines.
• Conducted quality assurance checks on labeled datasets.
• Validated NLP data with multiple review cycles.
• Contributed to scalable data pipelines for AI training.

2022 - 2024

AI Data Quality Analyst at e2f / TrustScale

Other, Text
At e2f / TrustScale, I contributed to AI data quality analysis and content verification for language datasets. My responsibilities included evaluating data for consistency and verifying content accuracy. Attention to linguistic detail and policy alignment were essential.
• Reviewed and validated large-scale language data for errors.
• Performed content verification and compliance checks.
• Ensured data met quality and policy standards.
• Documented and reported quality metrics to stakeholders.

2022 - 2023

Education

Arizona State University

Bachelor of Arts, English and Communication Studies

2019 - 2023

Work History

Multiple AI Companies

AI Trainer / LLM Evaluator (Remote, Contract)

2021 - Present