Martin Gould

Expert in LLM Eval & Text Gen in English & Chinese

London, United Kingdom
$40.00/hr · Expert · Appen · Data Annotation Tech · Remotasks

Key Skills

Software

Appen
Data Annotation Tech
Remotasks
Scale AI
Toloka
Labelbox

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming
Image
Text

Top Task Types

Computer Programming Coding
Evaluation Rating
Prompt Response Writing SFT
RLHF
Translation Localization

Freelancer Overview

As an AI Data & QA Specialist with 6+ years of experience in annotation, LLM behavior testing, and scalable data operations, I help AI teams build smarter, safer, and more reliable models. I've worked on projects ranging from multimodal dataset creation (text, audio, image, and video) to prompt engineering and A/B testing for top AI companies, including Goodnotes, Focal Systems, Gatik, and e2f. Whether designing structured evaluation prompts or optimizing annotation workflows, I bring precision and cross-functional collaboration to every phase of model development. My expertise lies in creating high-quality training and evaluation datasets that consistently deliver 98%+ accuracy, directly improving downstream model performance. I'm fluent in tools like Labelbox and Tagtog as well as internal annotation platforms, and I'm comfortable adapting to fast-paced AI research environments.

Expert: Swahili, English, Chinese (Mandarin)

Labeling Experience

Labelbox

LLM Evaluation and Text Generation for Multilingual Chatbots

Labelbox · Text · Classification · Text Generation
I worked on a large-scale LLM evaluation and text generation project for a leading chatbot development company. The task involved evaluating the performance of their multilingual LLMs on various customer service scenarios, generating high-quality responses in English and Chinese, and annotating entities using NER classification. The project scope included 10,000+ text samples, with a focus on achieving high accuracy and consistency. I adhered to strict quality measures, including inter-annotator agreement (IAA) checks and regular feedback loops with the project team.

2024 - 2025

Education

Athena Global Education

Bachelor's Degree, Data Analytics

2022 - 2023

Rongo University

Master's degree in Computer Science; Bachelor of Arts (BA), Sociology, Criminology and Community Development

2013 - 2017

Work History

Goodnotes

AI Data Specialist

London
2024 - Present

Focal Systems

Data Labeler (Reviewer)

Burlingame
2021 - 2023