LLM Evaluation | Appen
I evaluated large language model (LLM) outputs for accuracy, relevance, and adherence to guidelines, conducting comparative assessments and providing structured feedback used to fine-tune models and improve their reliability.
• Conducted LLM response evaluations across a range of prompt types
• Provided rationales and rankings for competing model outputs
• Collaborated with global evaluation teams
• Helped improve response accuracy by 20%