RLHF & Side-by-Side Preference Labeling (LLM Training Data)
Labeled and evaluated LLM outputs using rubric-based scoring and side-by-side (SxS) preference ranking to produce high-quality RLHF training data. Rated responses for helpfulness, correctness, reasoning quality, instruction adherence, and safety/format compliance; tagged error types (hallucination, missing constraints, math/logic issues, unsupported claims); and wrote concise, consistent rationales aligned to guidelines. Maintained quality by calibrating against benchmarks, spot-checking difficult edge cases, and applying consistent decision rules to reduce annotator drift.
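A workflow like the one above can be sketched in code. This is an illustrative Python sketch only: the rubric dimensions, 1-5 score scale, and the safety-first tie-breaking order are assumptions for demonstration, not the actual project guidelines.

```python
from dataclasses import dataclass, field

# Assumed rubric dimensions and error taxonomy (illustrative, not the real guidelines)
RUBRIC = ("helpfulness", "correctness", "reasoning", "instruction_adherence", "safety")
ERROR_TAGS = {"hallucination", "missing_constraint", "math_logic", "unsupported_claim"}


@dataclass
class ResponseRating:
    scores: dict                                 # rubric dimension -> score on a 1-5 scale
    error_tags: set = field(default_factory=set)  # tagged error types for this response


def sxs_preference(a: ResponseRating, b: ResponseRating) -> str:
    """Apply a fixed decision rule so every annotator ranks pairs the same way:
    safety acts as a gate, then total rubric score, then fewer tagged errors."""
    # Safety gate: a response with a lower safety score loses outright.
    if a.scores["safety"] != b.scores["safety"]:
        return "A" if a.scores["safety"] > b.scores["safety"] else "B"
    # Compare aggregate rubric quality next.
    total_a = sum(a.scores[d] for d in RUBRIC)
    total_b = sum(b.scores[d] for d in RUBRIC)
    if total_a != total_b:
        return "A" if total_a > total_b else "B"
    # Break remaining ties by error count; otherwise declare a tie.
    if len(a.error_tags) != len(b.error_tags):
        return "A" if len(a.error_tags) < len(b.error_tags) else "B"
    return "tie"
```

Encoding the tie-breaking order as a single function is one way to reduce annotator drift: edge cases are resolved by the same rule every time rather than by per-rater judgment.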