Jordan Klein - Generalist Evaluator Expert (LLM & Multimodal Models)

Key Skills

Software

Labelbox

Appen

Remotasks

Top Subject Matter

LLMs and Multimodal AI

NLP and Digital Platforms

Computer Vision and Multimodal AI

Top Data Types

Text

Image

Audio

Video

Document

Top Task Types

Entity (NER) Classification

Object Detection

Transcription

Prompt Response Writing SFT

Freelancer Overview

Generalist Evaluator Expert (LLM & Multimodal Models) with 7+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Core tools include Labelbox, Appen, and Remotasks. Education includes a Master of Science from Stanford University (2022) and a Bachelor of Science from Harvard University (2020). AI-training work covers data types such as text, image, and audio, and labeling workflows including evaluation, rating, and entity (NER) classification.

English: Expert

Labeling Experience

Multimodal AI Training Contributor (Video-to-Text)

Labelbox | Video | Prompt Response Writing SFT

On the Multimodal AI Training project, I generated video transformation instructions and contributed to the creation of advanced datasets. My tasks covered structured prompt design, scene description, and annotation for training multimodal systems. I improved AI capabilities by ensuring detailed and accurate video-to-text transformation.

• Generated multi-step instructions for video annotation.
• Designed prompts and structured annotation flows.
• Conducted transformation alignment for video datasets.
• Enhanced multimodal model training pipelines.

2021 - Present

Generalist Evaluator Expert (LLM & Multimodal Models)

Labelbox | Text

As a Generalist Evaluator Expert, I evaluated large language model (LLM) outputs and assessed their quality. My work included reviewing reasoning, summarization, translation, safety, and user intent alignment for advanced AI systems. I performed comparative and rubric-based assessments to enhance model performance and safety.

• Assessed LLM outputs for factual accuracy, hallucinations, and content risk.
• Designed prompts, test scenarios, and evaluation datasets.
• Delivered qualitative insights for model tuning.
• Supported safety optimization of AI systems.

2021 - Present

Audio Model Trainer (ASR/TTS)

Audio | Transcription

As an Audio Model Trainer, I transcribed, diarized, and segmented audio to train and evaluate ASR/TTS models. I worked extensively with varied audio conditions and evaluated ASR outputs for fluency, timing, and contextual alignment. My annotations supported model robustness and engineering team requirements.

• Transcribed and diarized speech data for ASR/TTS.
• Evaluated audio for accuracy and alignment.
• Annotated metadata to improve model training.
• Handled complex audio environments and multi-speaker scenarios.

2020 - Present

AI Data Annotation Specialist (Computer Vision & Multimodal)

Remotasks | Image | Object Detection

In this capacity, I annotated and processed vision datasets for computer vision tasks involving object detection and caption alignment. My duties included managing image annotation workflows and contributing to generative AI training via structured scene descriptions and transformation instructions. I ensured data quality by identifying errors and updating guidelines.

• Labeled and processed object detection data.
• Wrote scene descriptions and video-to-text transformation instructions.
• Improved dataset quality through systematic review.
• Maintained exceptional accuracy and consistency.

2020 - Present

AI Data Annotation Specialist (NLP)

Appen | Text | Entity (NER) Classification

In my role as an AI Data Annotation Specialist, I labeled and validated a high volume of samples across NLP and related workflows. My responsibilities included annotating data for entity recognition, sentiment analysis, intent detection, and classification. I consistently maintained high accuracy and contributed to quality assurance on multiple NLP projects.

• Performed labeling for NER, sentiment, intent, and classification.
• Processed and refined large-scale NLP datasets.
• Identified inconsistencies and revised annotation guidelines.
• Ensured 99% accuracy across all annotation tasks.

2020 - Present

Education

Stanford University

Master of Science, Machine Learning and Artificial Intelligence

2020 - 2022

Harvard University

Bachelor of Science, Computer Science

2016 - 2020

Work History

Southeastern Community College

Nursing Training Trainee

N/A

2023 - Present

Freelance

Remote Research Participant

N/A

2020 - Present