Kelvin Mungai - AI Model Evaluator - Large Language Models & Agent Systems

Key Skills

Software

Scale AI

Labelbox

Other

Top Subject Matter

No subject matter listed

Top Data Types

Image

Computer Code Programming

Video

Audio

Top Label Types

Segmentation

Entity Ner Classification

Point Key Point

Freelancer Overview

I am an experienced AI evaluation specialist with a strong background in data labeling, annotation, and model performance testing for large language models and autonomous agent systems. My expertise includes rubric-based scoring, structured qualitative feedback, and benchmarking methodologies to ensure accuracy, safety, and consistency in AI outputs. I have hands-on experience evaluating LLM responses for instruction adherence, factual consistency, and reasoning coherence, as well as identifying error patterns, hallucinations, and failure modes. My technical foundation in Python, SQL, and machine learning, combined with my ability to collaborate with cross-functional teams, enables me to contribute effectively to the continuous improvement of AI training data and evaluation standards.

Expert, English, Swahili

Labeling Experience

Video Annotation

Other, Video, Point Key Point, Segmentation

I performed video annotation on Atlas Capture, contributing to the development of AI and machine learning tools for computer vision applications. Responsibilities included applying classification labels to ensure consistency across video sequences. Tasks involved annotating actions from the ego (human) point of view as tasks were carried out.

2025

Data Labeling

Labelbox, Computer Code Programming, Entity Ner Classification

Performed code classification and labeling tasks on the Labelbox platform as part of AI alignment and model training projects. Responsibilities included reviewing and categorizing code snippets across multiple programming languages and evaluating code quality, correctness, and functionality. The work contributed to training LLMs to better understand code and generate accurate code outputs.

2022 - 2025

Labeling

Scale AI, Audio, Entity Ner Classification

I performed audio quality assessment and labeling tasks on the Outlier platform, contributing to the improvement of AI speech and audio recognition models. Responsibilities included reviewing audio recordings submitted by other contributors and evaluating them on key quality parameters such as clarity, background noise, pronunciation accuracy, naturalness of speech, accent origin, and overall recording quality. I maintained high accuracy and consistency throughout the project, contributing to cleaner and more reliable training datasets for AI audio and speech models.

2022 - 2024

Data Labeler

Scale AI, Image, Segmentation

Completed image segmentation on the Remotasks platform, contributing to the development and training of AI and machine learning models. Work involved precisely outlining and labeling objects within images using polygon and brush tools to create pixel-accurate segmentation masks.

2021 - 2022

Education

United States International University

Bachelor of Science, Computer Science

2017 - 2020

Work History

KCB PLC

Junior Software Developer

Nairobi

2022 - 2024
