Seth Otieno

AI Model Evaluation & Data Annotation Specialist

Nairobi, Kenya
$10.00/hr, Expert, Labelbox, Label Studio, CVAT

Key Skills

Software

Labelbox
Label Studio
CVAT

Top Subject Matter

Large Language Models
Conversational AI
NLP Domain Expertise

Top Data Types

Image
Video
Computer Code Programming

Top Task Types

Classification
Prompt + Response Writing (SFT)
Computer Programming/Coding
Transcription

Freelancer Overview

I am an AI model evaluation and data annotation specialist with a background in computer science and software engineering. I have reviewed and labeled datasets used to train large language models (LLMs), including work on reinforcement learning from human feedback (RLHF), comparative response ranking, and structured evaluation of AI outputs. My work involves identifying hallucinations, bias, and logical inconsistencies while ensuring alignment with safety and policy standards. I have also designed and tested prompts to analyze model behavior and improve response quality across diverse scenarios.

In addition to AI training work, I bring a solid technical foundation in Python, JavaScript, and data and machine learning libraries such as pandas and scikit-learn. I have worked remotely on AI evaluation and content moderation projects, contributing high-quality annotations and detailed reasoning to help improve model performance and reliability. This combination of technical expertise, analytical thinking, and experience with AI training pipelines allows me to deliver accurate, consistent contributions to AI development projects.

Expert, English, Swahili

Labeling Experience

AI Content Reviewer & Model Trainer

Other, Text, Classification
Reviewed and validated AI-generated content for accuracy and policy compliance. Labeled datasets for training moderation and classification models, flagging unsafe or non-compliant outputs. Provided structured feedback to improve model alignment and adherence to ethical standards.
• Contributed to safer AI deployment with actionable feedback.
• Supported model alignment with human expectations.
• Labeled and classified text content for moderation.
• Ensured compliance with safety guidelines.

2025 - 2026

Prompt Engineer & AI Systems Tester

Other, Text, Prompt + Response Writing (SFT)
Engineered structured prompts for conversational AI and iteratively tested prompt-response performance. Simulated real-world interactions to stress-test AI systems and documented response patterns. Collaborated remotely to enhance reliability and usability of AI models.
• Refined prompts based on behavioral analysis.
• Documented and analyzed model responses across multiple domains.
• Used collaborative tools for team communication.
• Improved system robustness through stress testing.

2024 - 2025

AI Model Evaluation & Data Annotation Specialist

Other, Text
Evaluated outputs from large language models (LLMs) for correctness, coherence, and alignment, using structured evaluation frameworks. Annotated high-volume datasets for supervised learning and reinforcement learning from human feedback (RLHF) pipelines, and performed comparative ranking of model outputs. Identified and documented failure patterns, applied NLP concepts, and maintained high accuracy under strict guidelines.
• Applied semantic similarity, intent classification, and contextual relevance during evaluations.
• Designed and tested prompts to improve model robustness.
• Provided detailed reasoning to guide model optimization.
• Adhered to quality assurance standards and deadlines.

2021 - 2024

Education

Kenyatta University

Bachelor of Science, Computer Science

2019 - 2023

Work History

Zeraki Kenya

Prompt Engineer & AI Systems Tester

Nairobi
2024 - 2025

Upwork

AI Model Evaluation & Data Annotation Specialist

Nairobi
2021 - 2024