Kelvin Mungai

AI Model Evaluator - Large Language Models & Agent Systems

Nairobi, Kenya
$20.00/hr, Expert, Scale AI, Labelbox, Other

Key Skills

Software

Scale AI
Labelbox
Other

Top Subject Matter

No subject matter listed

Top Data Types

Image
Computer Code Programming
Video
Audio

Top Label Types

Segmentation
Entity Ner Classification
Point Key Point

Freelancer Overview

I am an experienced AI evaluation specialist with a strong background in data labeling, annotation, and model performance testing for large language models and autonomous agent systems. My expertise includes rubric-based scoring, structured qualitative feedback, and benchmarking methodologies to ensure accuracy, safety, and consistency in AI outputs. I have hands-on experience evaluating LLM responses for instruction adherence, factual consistency, and reasoning coherence, as well as identifying error patterns, hallucinations, and failure modes. My technical foundation in Python, SQL, and machine learning, combined with my ability to collaborate with cross-functional teams, enables me to contribute effectively to the continuous improvement of AI training data and evaluation standards.

Expert, English, Swahili

Labeling Experience

Video Annotation

Other, Video, Point Key Point, Segmentation
I performed video annotation on Atlas Capture, contributing to the development of AI and machine learning tools for computer vision applications. Responsibilities included applying classification labels consistently across video sequences. Tasks involved annotating human actions from the ego (first-person) view as tasks were carried out.


2025
Labelbox

Data Labeling

Labelbox, Computer Code Programming, Entity Ner Classification
Performed code classification and labeling tasks on the Labelbox platform as part of AI alignment and model training projects. Responsibilities included reviewing and categorizing code snippets across multiple programming languages and evaluating code quality, correctness, and functionality. The work contributed to training LLMs to better understand and generate accurate code.


2022 - 2025
Scale AI

Labeling

Scale AI, Audio, Entity Ner Classification
I performed audio quality and labeling tasks on the Outlier platform, contributing to the improvement of AI speech and audio recognition models. Responsibilities included reviewing and assessing audio recordings submitted by other contributors, evaluating them on key quality parameters such as clarity, background noise, pronunciation accuracy, naturalness of speech, accent origin, and overall recording quality. I maintained high accuracy and consistency throughout the project, contributing to cleaner and more reliable training datasets for AI audio and speech models.


2022 - 2024
Scale AI

Data Labeler

Scale AI, Image, Segmentation
Completed image segmentation on the Remotasks platform, contributing to the development and training of AI and machine learning models. Work involved precisely outlining and labeling objects within images using polygon and brush tools to create pixel-accurate segmentation masks.


2021 - 2022

Education

United States International University

Bachelor of Science, Computer Science

2017 - 2020

Work History

KCB PLC

Junior Software Developer

Nairobi
2022 - 2024