
John Mcgurk

AI Evaluation & Annotation Specialist

Houston, USA
Expert · Remotasks · Labelbox · CVAT

Key Skills

Software

Remotasks
Labelbox
CVAT
Supervisely

Top Subject Matter

Multimodal LLM output evaluation and dataset QA
Supervised learning and ML model data preparation

Top Data Types

Text
Image

Top Task Types

RLHF

Freelancer Overview

AI Evaluation & Annotation Specialist with 2+ years of professional experience spanning complex annotation workflows, research, and quality-focused execution. Core platforms include Outlier AI, Remotasks, and Labelbox. Education: Bachelor of Science, University of Chicago (2020). AI-training focus covers data types such as Text and labeling workflows including RLHF, Evaluation, and Rating.


Labeling Experience

AI Evaluation & Annotation Specialist

Text · RLHF
As an AI Evaluation & Annotation Specialist, I performed Reinforcement Learning from Human Feedback (RLHF) and multimodal data labeling across text, image, and video tasks. I consistently maintained above-benchmark inter-annotator agreement and robust dataset quality assurance in high-throughput remote workflows. I ensured rigorous calibration, prompt-failure documentation, and comparative analysis across multiple large language models.
• Annotated and validated 70,000+ data points spanning text, images, and videos using RLHF frameworks
• Conducted comparative A/B evaluations of Claude, GPT-4o, Gemini, Grok, and Perplexity outputs
• Performed in-depth quality audits, flagging safety violations, unsupported claims, and instruction-following errors
• Used tools including Outlier AI, Remotasks, Labelbox, CVAT, Supervisely, and V7 Darwin

2022 - Present

Machine Learning Intern

Text
I contributed to the supervised learning pipeline through data labeling and model-output evaluation for ML training. My tasks included dataset annotation, performance validation, and reviewing model predictions for errors or inconsistencies. I also took part in collaborative error analysis and validation efforts.
• Participated in model output review and dataset annotation
• Assisted in dataset preparation and supervised learning validation
• Performed error identification and classification
• Contributed to feature engineering and validation workflows

2018 - 2019

Education

University of Chicago

Bachelor of Science, Artificial Intelligence

2020

Work History

DataMind Labs

Machine Learning Intern

Houston
2018 - 2019