
Moti

AI Content Evaluation

Remote, Ethiopia
$40.00/hr · Expert · Other · Labelbox · OpenCV AI Kit (OAK)

Key Skills

Software

Other
Labelbox
OpenCV AI Kit (OAK)
Prodigy
Sama
Internal/Proprietary Tooling
Data Annotation Tech

Top Subject Matter

Generative AI
Coding Domain Expertise
Reasoning Domain Expertise

Top Data Types

Text
Document
Computer Code Programming

Top Task Types

RLHF
Classification
Text Generation
Fine-tuning
Red Teaming
Evaluation/Rating
Computer Programming/Coding
Prompt + Response Writing (SFT)

Freelancer Overview

LLM Training Specialist. Brings 5+ years of professional experience across RLHF workflows, AI content evaluation, and quality-focused execution for production LLM systems, including an onsite AI Engineer internship in Tokyo, Japan (AOBA-BBT, METI AI/Tech Talent Program). Core strengths include Dataset Curation, Prompt Design, and Evaluation Benchmark Creation. Education includes Bachelor of Science in Software Engineering, Adama Science and Technology University (2025). AI-training focus includes data types such as Computer Code, Reasoning, and Multilingual Text, and labeling workflows including RLHF, Comparative Ranking, and Structured Error Analysis.

English (Expert) · Amharic

Labeling Experience

AI Trainer & Model Evaluation Specialist

Other · Text · RLHF
As an AI Trainer & Model Evaluation Specialist, I contributed to RLHF workflows for production LLM systems. My tasks included dataset curation, prompt design, and creation of evaluation benchmarks, ranking model outputs for diverse tasks while flagging issues in reasoning, bias, and safety. I also performed structured error analysis, producing feedback used in continuous model fine-tuning cycles.

• Evaluated and ranked model responses across coding, reasoning, and instruction-following
• Conducted comprehensive error analyses and flagged output issues
• Created and improved evaluation benchmarks
• Maintained strong quality scores and supported model alignment

2025 - 2026

AI Content Evaluation

Other · Text
I evaluated AI-generated responses for generative AI systems by applying detailed quality rubrics covering accuracy, relevance, safety, clarity, and instruction-following. My work involved comparative ranking of AI model outputs and identification of issues such as hallucinations, factual inconsistencies, bias, and toxicity. This supported dataset quality benchmarking for AI labs across content domains including coding, multilingual tasks, and reasoning.

• Performed multi-dimensional assessment on text outputs
• Flagged reasoning failures and safety violations
• Ranked outputs for overall quality and consistency
• Maintained consistent quality ratings while handling substantial evaluation volume

2024 - 2025

Education


Adama Science and Technology University

Bachelor of Science, Software Engineering

2021 - 2025

Work History


Upwork

Freelance Software Engineer

Remote
2023 - Present

Quote.Vote

Team Lead, Code Reviewer & Developer

Remote
2022 - Present