Daeshauwna Jones

Multilingual LLM Evaluation Specialist | NLP & Data Labeling Expert

Spring, USA
$10.00/hr | Expert | AWS SageMaker | Appen | Clickworker

Key Skills

Software

AWS SageMaker
Appen
Clickworker
CloudFactory
CrowdFlower
Data Annotation Tech
Figure Eight
Labelbox
Label Studio
Lionbridge
Mindrift
Remotasks
Scale AI
Snorkel AI
SuperAnnotate
Toloka
Telus
Internal/Proprietary Tooling

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Medical DICOM
Text

Top Task Types

Bounding Box
Classification
Evaluation Rating
Prompt Response Writing SFT
Question Answering

Freelancer Overview

I'm a seasoned AI Data Labeling and NLP Specialist with over 7 years of experience supporting top-tier clients through platforms like Upwork, Appen, Toloka, and Scale AI. I’ve contributed to diverse projects spanning LLM evaluation, multilingual named entity recognition (NER), image segmentation for medical diagnostics, sentiment classification, and prompt engineering for fine-tuning AI systems. My work stands out for its high accuracy, fast turnaround, and strong quality assurance scores. I specialize in evaluating large language model (LLM) outputs for safety, relevance, and factual integrity, as well as crafting complex prompt-response pairs across multiple languages. I’m also skilled in tools like Label Studio, CVAT, and AWS SageMaker. With a background in linguistics and a sharp eye for detail, I bring consistency, adaptability, and insight to every project I take on.

English: Expert

Labeling Experience

Scale AI

LLM Prompt Evaluation & Safety Rating

Scale AI | Text | Classification | Question Answering
Evaluated thousands of AI-generated responses for safety, helpfulness, coherence, tone, and factual accuracy using proprietary guidelines. Tasks included rating LLM outputs, identifying harmful or biased content, performing red teaming, and writing prompt-response pairs for fine-tuning. This project supported multilingual evaluation (English, Spanish) and required attention to linguistic nuance and ethical alignment. Delivered consistent, high-quality feedback with a QA approval rate exceeding 95%. Labeled over 500 hours of training and validation data in collaboration with global AI teams.

2022 - 2023

Scale AI

Medical Image Segmentation & Classification for Diagnostic AI

Scale AI | Image | Bounding Box | Segmentation
Performed pixel-level image segmentation and classification of radiology scans including X-rays and ultrasound images. Used CVAT and Scale AI platforms to label anatomical structures, highlight regions of interest, and identify potential abnormalities (e.g., lesions, fluid buildup). Collaborated closely with QA reviewers to ensure high clinical accuracy and consistency. The dataset was used to train and validate early-stage diagnostic AI models aimed at supporting radiologists and improving early disease detection. Maintained 97%+ accuracy across more than 20,000 labeled images.

2019 - 2021

Education

Coursera (Stanford/DeepLearning.AI)

Online Certificate, Artificial Intelligence and Machine Learning

2022 - 2022

University of Houston

Bachelor of Arts, Linguistics

2014 - 2018

Work History

Freelance via Upwork

AI Data Labeling Specialist

Spring
2020 - 2025

Appe

Data Annotation Contractor

Spring
2018 - 2022