Bonface Mosabi

LLM Evaluation, Prompt Engineering and Data Annotation.

Nairobi, Kenya
$30.00/hr | Expert | Appen | Remotasks

Key Skills

Software

Appen
Remotasks

Top Subject Matter

Large Language Models (LLMs)
General AI
Data Annotation

Top Data Types

Text
Document
Video

Top Task Types

Prompt Response Writing SFT
Classification
Bounding Box
Segmentation
Text Generation
Question Answering
RLHF
Red Teaming
Fine Tuning
Text Summarization

Freelancer Overview

I'm happy to work on projects involving AI model evaluation, data annotation, prompt engineering, and quality assurance across diverse subject areas. With 4 years of experience training and evaluating LLMs on platforms like Appen and Remotasks, I've developed the ability to adapt quickly to new domains, from technical and analytical tasks to creative and language-based projects. I'm particularly interested in projects that challenge me to apply critical thinking, problem-solving, and attention to detail in new and unfamiliar contexts. I welcome the opportunity to contribute across multiple disciplines, whether it involves content evaluation, research tasks, data interpretation, or testing AI systems for accuracy and safety. My goal is to bring my versatile skill set to any project that demands quality, adaptability, and a willingness to learn.

Expert | Swahili | English

Labeling Experience

Appen

LLM Evaluation | Appen

Appen | Text | RLHF
I evaluated large language model (LLM) outputs for accuracy, relevance, and adherence to guidelines. My work included conducting comparative assessments and providing structured feedback to enhance LLM performance. This contributed significantly to fine-tuning and improving AI model reliability.
• Conducted LLM response evaluations for various prompt types
• Provided rationale and ranking for model outputs
• Collaborated across global evaluation teams
• Helped improve response accuracy by 20%

2025 - 2026
Appen

AI Prompt Engineer | Appen

Appen | Text | Prompt Response Writing SFT
I engineered and tested prompts to evaluate and stretch the limits of AI language models. Scenario development and adversarial inputs were core aspects of my workflow. My work resulted in more robust and safer AI models across real-world applications.
• Created complex and adversarial prompts for LLMs
• Designed real-world tasks to challenge model logic
• Contributed to prompt engineering guidelines
• Achieved a 20% improvement in model safety

2024 - 2025
Remotasks

Data Annotation | Remotasks

Remotasks | Image | Bounding Box
I performed high-accuracy data annotation and labeling on text, images, videos, and 3D scenes for AI training. Ensuring data quality, reliability, and compliance with benchmarking guidelines was my primary responsibility. My input aided in developing structured datasets critical to AI evaluation and model development.
• Labeled and classified text data for model training
• Maintained 100% accuracy over multiple projects
• Refined annotation guidelines and rubrics
• Consistently achieved top quality scores

2022 - 2024

Education

Mount Kenya University

Bachelor of Science, Computer Science

2021 - 2025

Work History

Oneforma

Translation and Localization

Nairobi
2025 - 2025