Doreen Wambui - AI Prompt & LLM Evaluation Specialist - AI Quality & Safety

Key Skills

Software

Scale AI

Telus

Top Subject Matter

No subject matter listed

Top Data Types

Image

Text

Top Label Types

Bounding Box

Polygon

Classification

RLHF

Evaluation Rating

Prompt Response Writing SFT

Freelancer Overview

I specialize in AI prompt evaluation, LLM response assessment, and data annotation, with hands-on experience ensuring the quality, relevance, and safety of AI-generated outputs. My work involves ranking and reviewing large volumes of model responses using platforms like ChatGPT, Remotasks, UHRS, Toloka, and Surge AI, always adhering to rigorous guidelines and quality standards. I’m skilled at detecting bias, misinformation, and policy violations, and I contribute structured reinforcement feedback to help align AI models with human expectations. My background includes supporting AI safety initiatives and participating in complex evaluation workflows that demand strong analytical judgment and attention to detail, particularly in natural language processing domains.

Intermediate, Swahili, English

Labeling Experience

Freelance AI Projects

Telus, Text, Prompt Response Writing SFT

Contributed to AI prompt engineering initiatives to improve model adaptability. Performed independent RLHF evaluations to optimize conversational AI response quality. Executed multilingual annotation tasks to improve the AI experience for users worldwide. Promoted to Quality Reviewer and AI Feedback Specialist in recognition of performance excellence and leadership.
• Improved model safety and alignment.
• Enhanced AI performance relative to user expectations.
• Supported edge case and ethical compliance reviews.
• Demonstrated strong analytical and critical thinking skills.

2025

AI Model Training & Data Annotation

Scale AI, Image, RLHF

Conducted large-scale annotation of text, image, and video datasets to train machine learning models for accuracy and contextual awareness. Evaluated AI-generated outputs, ranked their quality, and provided reinforcement feedback to help align models with human expectations. Identified harmful, biased, or unsafe outputs and flagged them for AI safety mitigation. Collaborated on high-judgment AI quality tasks involving nuanced decision-making, ambiguity resolution, and user-intent prediction.
• Executed multilingual content evaluation tasks.
• Improved AI model accuracy by 23%.
• Annotated over 12,000 samples with 98.7% accuracy.
• Reduced unsafe AI outputs by 18%.

2022

LLM Prompt Evaluation & Response Rating

Telus, Text, Classification, RLHF

I worked on language model evaluation tasks at Telus focused on improving the quality of AI-generated text. The project involved reviewing and rating AI responses to prompts based on predefined guidelines, including accuracy, relevance, completeness, tone, and instruction adherence. I compared multiple model outputs and selected the best response while identifying errors, ambiguities, or policy issues. The work required strong attention to detail, consistency in applying evaluation criteria, and clear judgment when handling edge cases. I followed strict quality standards to ensure reliable feedback for model improvement, contributing to higher-quality training data for large language models.

2025 - 2025

Autonomous Driving Image Annotation

Scale AI, Image, Bounding Box, Polygon

I worked on large-scale computer vision data annotation projects for autonomous driving use cases at Scale AI. The project involved annotating road scene imagery to support perception models for self-driving systems. My tasks included accurate object labeling using bounding boxes, polygons, and cuboids for vehicles, pedestrians, cyclists, traffic signs, and other roadway elements. I followed strict annotation guidelines and quality standards, ensuring consistency across frames and edge cases such as occlusions, overlaps, and varying lighting conditions. I also participated in quality review processes, correcting annotation errors and validating outputs before submission. This work required high attention to detail, consistency, and the ability to interpret complex visual scenes to produce reliable training data for machine learning models.

2022 - 2024

Education

NORAI

Certificate, Prompt Engineering

Certificate

2025 - 2025

Kiriri Women's University of Science and Technology

Diploma in Human Resource Management, Human Resource Management

Diploma in Human Resource Management

2022 - 2024

Work History

Various Settings

Customer Interaction & Support Experience

Nairobi

2022 - 2025

Nairobi City County

Administrative & Office Support

Nairobi

2023 - 2023