Robert Jacksonny

Lead AI Evaluator

New York
Expert

Key Skills

Software

No software listed

Top Subject Matter

General NLP
Medical Domain Expertise
Legal Services & Contract Review

Top Data Types

Text
Document

Top Task Types

RLHF
Fine-tuning
Prompt + Response Writing (SFT)
Red Teaming

Freelancer Overview

Lead AI Evaluator. Core strengths include Internal and Proprietary Tooling. Education includes Bachelor of Science, State University of New York and Certificate, DeepLearning.AI. AI-training focus includes data types such as Text, Computer Code, and Programming and labeling workflows including RLHF, Fine-tuning, and Prompt + Response Writing (SFT).


Labeling Experience

Lead AI Evaluator

Text, RLHF
As Lead AI Evaluator at Neural-Logic Systems, I was responsible for evaluating and ranking model outputs through RLHF methodologies. I assessed thousands of AI-generated completions for truthfulness, helpfulness, and harmlessness, supporting reward model improvement. Additionally, I led efforts in red-teaming, quality benchmarking, and vulnerability identification.
• Ranked over 15,000 model completions to improve RLHF reward models.
• Led red-teaming exercises focused on discovering edge-case vulnerabilities in medical and legal queries.
• Authored internal quality rubrics for annotator teams.
• Improved model accuracy and safety through systematic data evaluation.


2024 - Present

Red Team Data Labeling Contributor (Adversarial Safety Suite)

Text, Red Teaming
As part of the Adversarial Safety Suite project, I designed and authored adversarial prompts to test and improve LLM safety guardrails. My efforts focused on generating content constructed to bypass model safeguards, exposing critical AI vulnerabilities. The project identified key risks and directly influenced mitigation before product release.
• Created over 800 adversarial attack prompts ('jailbreaks' and persona simulation).
• Specialized in prompt-based safety and filter circumvention tests.
• Targeted critical safety gaps in LLM filters.
• Ensured vulnerabilities were addressed prior to public deployment.


2023 - 2023

AI Training Project Contributor (Logic-Leap Debugger)

Text, Prompt + Response Writing (SFT)
For the 'Logic-Leap' Debugger project, I authored chain-of-thought examples to guide LLM solutions for math and logic challenges. The structured data I created enforced stepwise logical reasoning in LLM behavior. This contributed to measurable gains on internal math benchmarks.
• Developed over 1,200 chain-of-thought prompt+response sets for reasoning tasks.
• Focused on internal GSM8K-style math and logic benchmarks.
• Enhanced LLM robustness for complex multi-step problem types.
• Advanced logic tutoring strategies for model explainability.


2023 - 2023

Technical Data Curator

Fine-tuning
In my role as Technical Data Curator at Syntax & Soul Media, I curated, prepared, and annotated datasets focused on coding prompts to enable LLM fine-tuning. My work prioritized dataset optimization for developer tools and reduced unproductive model hallucination. I managed SFT pipelines and ensured adherence to safety and brand guidelines.
• Curated over 5,000 annotated prompts for fine-tuning code LLMs.
• Developed prompt chaining strategies to reduce hallucination in summarization tasks.
• Managed Supervised Fine-Tuning operations for chatbots.
• Ensured 99% brand voice and safety compliance in training data.


2022 - 2023

Education

N/A

Certificate, Prompt Engineering

Certificate
Not specified

DeepLearning.AI

Certificate, Artificial Intelligence

Certificate
Not specified

Work History

Robert J. hasn’t added any Work History to their OpenTrain profile yet.