Daniel Murphy - Versatile LLM Evaluator Rewriter & Tester

Key Skills

Software

Other

Internal/Proprietary Tooling

Top Subject Matter

No subject matter listed

Top Data Types

Audio

Image

Text

Top Task Types

Classification

Evaluation Rating

Prompt Response Writing SFT

RLHF

Translation Localization

Freelancer Overview

I have experience in AI training data and evaluation, specialising in refining and testing LLMs to ensure accuracy, ethical compliance, and high-quality output. My work includes conducting RLHF testing, designing and refining prompts to assess model reasoning, and providing actionable feedback to improve performance. I have also rigorously evaluated AI-generated outputs for safety and compliance, contributing to the development of reliable and responsible AI systems. I use strong analytical and communication skills to enhance model performance and align outputs with user expectations. I bring a detail-oriented, methodical approach to data labelling and AI testing, supported by hands-on experience in optimising processes, identifying issues, and driving continuous improvement.

Entry Level, English, Spanish

Labeling Experience

Safety RLHF Testing

Internal Proprietary Tooling, Image, Question Answering, Text Generation

In this project, I focused on RLHF safety testing to evaluate AI model outputs for adherence to safety and ethical standards. Key tasks included conducting rigorous assessments of AI-generated responses to identify and prevent harmful, biased, or inappropriate content; developing and applying test cases that simulate real-world scenarios to check that the AI handled sensitive or complex topics responsibly; and providing actionable feedback to improve the model's compliance with safety protocols and ethical guidelines.

2024

RLHF Testing

Internal Proprietary Tooling, Text, Question Answering, Text Generation

In this project, I performed RLHF testing to evaluate AI model performance. My work focused on designing and refining prompts to assess model reasoning, factual accuracy, and ethical alignment; conducting detailed evaluations of AI-generated responses against quality, clarity, and relevance standards; and identifying areas for improvement and providing structured feedback to optimise model behaviour and safety.

2024

Audio Based Testing and Labelling

Internal Proprietary Tooling, Audio, Text Generation, Emotion Recognition

In this project, I conducted rigorous audio-based testing to evaluate the quality, relevance, and accuracy of AI-generated outputs. My work involved assessing AI performance across multiple dimensions, including factual accuracy, contextual relevance, and ethical alignment; providing detailed annotations and actionable feedback to improve model outputs; and identifying and tagging areas for improvement to support compliance with safety and ethical guidelines.

2024

Education

Canterbury Christ Church University

PGCE, Secondary Education (History)

2014 - 2015

Nottingham Trent University

BA (Hons), History and International Relations

2009 - 2012

Work History

Outlier AI

LLM Rewriter & Tester

London

2024 - Present

British Land

Assistant Property Manager

London

2024 - 2024