Charlotte Somerville

LLM Evaluation & Prompt Engineering Specialist for complex tasks

King's Lynn, United Kingdom
$22.00/hr · Entry Level · Data Annotation Tech · Remotasks · Scale AI

Key Skills

Software

Data Annotation Tech
Remotasks
Scale AI
Telus

Top Subject Matter

No subject matter listed

Top Data Types

Image
Text
Video

Top Task Types

Emotion Recognition
Evaluation Rating
Fine Tuning
Prompt Response Writing SFT
Question Answering

Freelancer Overview

I have extensive experience in data labeling and AI training across text, image, and video, with a strong focus on evaluation and prompt engineering for large language models. My work spans factuality assessment, instruction following, long-context reasoning, and nuanced dialogue evaluation. I have contributed to projects involving rubric design, multi-turn conversation testing, behavioural instruction development, and image-inclusive tasks that require cross-modal reasoning. My expertise also includes emotion recognition, fact-checking, and educational/tutoring assistant evaluations, as well as creative writing and narrative generation tasks. This combination of analytical and creative skills allows me to assess both technical precision and natural language fluency. I bring a keen eye for subtle errors, strong writing skills, and the ability to provide constructive feedback that improves model outputs. My versatility across evaluation, prompt & response writing (SFT), fine-tuning support, and question answering makes me well-suited to a wide range of generalist AI training projects.

Entry Level · French · Hebrew · English

Labeling Experience

Data Annotation Tech

Cross-Model Comparative LLM Evaluation

Data Annotation Tech · Text · Question Answering · Evaluation Rating
Compared outputs from multiple frontier LLMs on identical prompts. Rated responses for correctness, clarity, style, safety, and grounding. Highlighted strengths, weaknesses, and qualitative differences between models. Provided detailed feedback to support model benchmarking.

2025
Data Annotation Tech

Cross-Domain Evaluation – Creative & Educational Tasks

Data Annotation Tech · Text · Question Answering · Evaluation Rating
Evaluated model outputs across diverse domains including creative writing, educational tutoring, and structured reasoning tasks. Assessed for accuracy, creativity, clarity, and age-appropriate tone. Applied rubrics to measure instruction adherence and engagement.

2025
Data Annotation Tech

Multi-Turn Conversation Evaluation – Complex Reasoning

Data Annotation Tech · Text · Question Answering · Emotion Recognition
Evaluated multi-turn conversations for depth of reasoning, factuality, and emotional appropriateness. Rated outputs for instruction-following, tone alignment, and clarity. Provided structured feedback on conversation coherence and engagement quality.

2025
Data Annotation Tech

Behavioural Instruction & System Prompt Evaluation

Data Annotation Tech · Text · Fine Tuning · Evaluation Rating
Created and evaluated behavioural instruction sets for system prompts. Tested model adherence to constraints, tone, and safety rules across multi-turn conversations. Compared outputs against expected behaviours to refine instruction effectiveness.

2025
Data Annotation Tech

Factuality Evaluation – LLM Claim Verification

Data Annotation Tech · Text · Question Answering · Evaluation Rating
Conducted detailed factuality assessments of model outputs against reference documents. Identified inaccuracies, graded severity, and provided explanations of truthfulness. Focused on grounding responses in given sources and measuring distance from factual truth.

2025

Education

University of the Arts, London

Bachelor of Arts (Honours), Photojournalism & Documentary Photography

2013 - 2016
Truro School

A Levels in Philosophy & Ethics, Biology, and Chemistry

2010 - 2012

Work History

DataAnnotation.Tech

Data Annotator

King's Lynn
2025 - Present
Costa Coffee

Senior Barista Maestro

King's Lynn
2022 - 2025