Niall Gorton - Data Annotator and AI Content Evaluator

Key Skills

Software

Data Annotation Tech

Top Subject Matter

No subject matter listed

Top Data Types

Text

Image

Audio

Document

Geospatial Tiled Imagery

Top Label Types

Question Answering

Evaluation Rating

Data Collection

Audio Recording

Transcription

Text Generation

Text Summarization

Freelancer Overview

On most of my projects as a Data Annotator and AI Content Evaluator, I have reviewed and assessed AI-generated responses for relevance, accuracy, and overall quality, whilst identifying and flagging formatting issues, inconsistencies, biases, and factual errors. This work also involved detecting subtle issues in the logic, coherence, spelling, and grammar of the model's responses. On projects focused specifically on factuality, I validated every claim in a model's response, and located and cited reliable, credible sources to support or contradict each one.

Other projects involved "stress testing" models by writing my own prompts (and sometimes my own system prompts) that targeted the model's weaknesses, attempting to cause instruction-following, factuality, or formatting failures. I also have good experience producing rubrics and answer keys for model responses, and editing a model's response into the "golden response" by removing all issues and rewriting it in Markdown.

I have also worked on more niche projects. On a couple of image analysis projects, I analysed AI-generated images for anatomical mistakes, unintelligible text, and style inconsistencies. On a couple of voice projects, I contributed voice data to train models, helping improve their ability to recognise and understand different accents, speech patterns, and voices in quiet or loud environments. Most recently, I have been working on a project in which an AI model has access to various documents (emails, Google Docs, Google Slides, Google Sheets) and is often asked to summarise or rewrite them; I verify that the model's responses are fully grounded, truthful, helpful, and of high quality.

Entry Level

English

Labeling Experience

Source Verification

Data Annotation Tech, Text, Evaluation Rating

This project involved confirming whether the webpages and sources a model used to produce its response actually support the information presented in that response. It was fairly straightforward: I confirmed that the cited webpages contain relevant, factually accurate information, and that the sources are reliable and free of biases of their own.

2025

Stress Test Projects

Data Annotation Tech, Text, Question Answering, Text Generation

This project involved holding a multi-turn conversation with an AI model, writing my own prompts and attempting to make it fail across a range of areas, most importantly instruction following (explicit or implicit instructions given in the prompt) and truthfulness (responses that present factually incorrect information). It was fairly simple while the model was in the early stages of development, but became far harder as the model grew more advanced and failures became increasingly difficult to trigger.

2025

Rubric Production and Golden Response Editing

Data Annotation Tech, Text, Question Answering, Evaluation Rating

This was a range of projects in which I scored a model's responses across areas including instruction following, factuality, harmfulness, writing quality, verbosity, and overall quality. The next step was to produce a set of individual rubrics, all of which a response would have to meet to be considered "perfect". Some (but not all) of the rubric-based projects also involved editing the model's response so that it met every rubric I had produced in the previous stage. The edited responses had to be written in Markdown.

2025

SxS Fact Checking

Data Annotation Tech, Text, Question Answering, Evaluation Rating

This project involved comparing two model responses side by side, with a specific focus on the factuality of both. I had to verify every claim made in both responses and either prove or disprove each one with a reliable citation (claims could be accurate, inaccurate, unsupported, or disputed), alongside scoring each response on instruction following, writing quality, harmfulness, verbosity, and overall quality. I wrote detailed comments to explain my scoring for each response, and a further comment to explain my decision on which response was better (and more factually accurate) overall.

2025

SxS Overall Quality Projects

Data Annotation Tech, Text, Question Answering, Evaluation Rating

This project consists of assessing two (or more) model responses side by side, scoring them across a wide range of areas including instruction following, truthfulness, grounding, writing quality, verbosity, formatting, harmfulness, and overall quality, and lastly deciding which model response is the best. The project is very large and still ongoing, and it still occasionally appears on my dashboard.

2025

Education

University of Portsmouth

Master of Science, Criminal Intelligence

2021 - 2022

University of Portsmouth

Bachelor of Science, Physics, Astrophysics, and Cosmology

2018 - 2021

Work History

Oyster Partnership

Recruitment Consultant

London

2023 - 2025

Finseta

Associate FX Specialist

London

2023 - 2023