LLM Text Evaluation and Prompt Engineering Project
As part of an ongoing LLM evaluation initiative with Scale AI, I designed and labeled text-based datasets to improve the accuracy, coherence, and contextual reasoning of large language models. My tasks included writing and evaluating thousands of prompt–response pairs, classifying text by tone and intent, and assessing AI-generated outputs for factual consistency, creativity, and compliance with task instructions. I also took part in the quality assurance and scoring phase, rating model outputs against detailed rubrics and collaborating with cross-functional teams to identify linguistic biases and style inconsistencies. The project ran through iterative testing cycles and required maintaining annotation accuracy above 98% against internal benchmarks.
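As a rough illustration of the rubric-based scoring and accuracy tracking described above, the Python sketch below shows how a rated prompt–response pair might be recorded against rubric dimensions and how annotation accuracy could be checked against a gold-labeled subset. The dimension names, score scale, labels, and 98% threshold here are assumptions for illustration only; they do not reflect Scale AI's internal rubrics or tooling.

```python
from dataclasses import dataclass, field

# Hypothetical rubric dimensions; the actual internal rubric is not public.
RUBRIC_DIMENSIONS = ("factual_consistency", "creativity", "instruction_compliance")


@dataclass
class RatedResponse:
    """One prompt-response pair with per-dimension rubric scores (assumed 1-5 scale)."""
    prompt: str
    response: str
    scores: dict = field(default_factory=dict)  # dimension -> score

    def overall(self) -> float:
        """Unweighted mean across rubric dimensions."""
        return sum(self.scores.values()) / len(self.scores)


def annotation_accuracy(labels: list[str], gold: list[str]) -> float:
    """Fraction of annotations that match a gold-labeled benchmark set."""
    matches = sum(1 for mine, ref in zip(labels, gold) if mine == ref)
    return matches / len(gold)


if __name__ == "__main__":
    rated = RatedResponse(
        prompt="Summarize the article in two sentences.",
        response="The article argues that ...",
        scores={"factual_consistency": 5, "creativity": 3, "instruction_compliance": 4},
    )
    print(f"overall rubric score: {rated.overall():.2f}")

    # Tone/intent labels checked against a small gold subset (illustrative data only).
    my_labels = ["informative", "persuasive", "informative", "neutral"]
    gold_labels = ["informative", "persuasive", "informative", "informative"]
    acc = annotation_accuracy(my_labels, gold_labels)
    print(f"annotation accuracy: {acc:.2%} (benchmark: >98%)")
```

In practice, the accuracy check would run over a much larger gold subset than shown here, with per-dimension agreement tracked separately rather than as a single match rate.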