Josue Valencia

Senior Statistical Content Developer – Anthropic AI Research Division

Stanford, USA
$30.00/hr, Expert, Other, Scale AI, Surge AI

Key Skills

Software

Other
Scale AI
Surge AI
Labelbox
CloudFactory
Appen

Top Subject Matter

Statistical content for large language model training
Statistical reasoning for LLM training
Statistical data for AI model evaluation

Top Data Types

Text
Image

Top Task Types

Question Answering

Freelancer Overview

Senior Statistical Content Developer – Anthropic AI Research Division, with 2+ years of professional experience across complex workflows, research, and quality-focused execution. Core strengths include Internal, Proprietary Tooling, and Other. Education includes an Associate's Degree from Foothill College (2025) and a Doctor of Philosophy from Stanford University (2023). AI-training focus covers Text data and Question Answering labeling workflows.

English (Expert)

Labeling Experience

Senior Statistical Content Developer – Anthropic AI Research Division

Text, Question Answering
I designed and authored over 150 original statistics question-answer pairs monthly, leveraging research publications to enhance LLM statistical reasoning. I crafted multimodal questions requiring integration of visual statistical content, such as graphs and mathematical notation, to systematically identify image-reading failures. I collaborated closely with ML engineers to ensure generated data effectively targeted and challenged model weaknesses in multimodal reasoning.
• Maintained a 98% peer review accuracy for technical statistical content
• Authored prompts specifically to address known LLM knowledge gaps
• Piloted methods for exploiting multimodal reasoning deficiencies
• Validated annotations through iterative feedback with the engineering team

2024 - Present

Lead Research Annotator - OpenAI Content Solutions

Other, Text, Question Answering
I created over 200 high-complexity statistical reasoning scenarios derived from econometric and biostatistical literature to advance AI model training. I specialized in extracting insights from visual data such as figures, regression tables, and probability distributions to formulate multimodal statistical annotation tasks. I developed detailed annotation guidelines and pioneered methods to identify visual-only statistical information that challenges model limits.
• Led a 15-person annotation team focused on statistical reasoning
• Created benchmark datasets for visual statistical comprehension
• Developed rubric and calibration guidelines for multimodal annotation
• Drove methodology for testing LLM visual information processing limits

2023 - 2024
Scale AI

Statistical Dataset Architect – Scale AI Academic Research Division

Scale AI, Text, Question Answering
I authored more than 300 question-answer pairs from peer-reviewed studies, specifically targeting LLM vulnerabilities in interpreting advanced statistical visualizations like forest plots, correlation matrices, and survival curves. I performed rigorous quality assurance reviews to ensure accuracy, mathematical rigor, and high pedagogical value in all annotated content. I achieved a 96% inter-rater reliability score for technical accuracy through expert assessment and review cycles.
• Focused on Bayesian statistics, causal inference, and experimental design annotation
• Produced datasets addressing model failures in visual statistics comprehension
• Managed technical documentation for annotation best practices
• Guaranteed technical content met educational benchmarks for AI models

2023 - 2023

Remote Statistical Content Specialist – DataRobot AI Training Labs

Other, Text, Question Answering
I generated original statistical problems and scenarios based on recent AI research publications in time series analysis, spatial statistics, and machine learning. I developed over 175 complex problems requiring combined interpretation of multiple figures, tables, and statistical notations to train multimodal models. I worked collaboratively with international statisticians to ensure content diversity and technical rigor for LLM evaluation.
• Identified model failure patterns in interpreting images with embedded notation
• Developed datasets emphasizing integration of text and visual information
• Coordinated with global PhD statisticians for cross-validation
• Pioneered challenging multimodal evaluation problem creation

2022 - 2022
Surge AI

PhD Research Consultant – Surge AI Statistics Team

Surge AI, Text, Question Answering
I created specialized question sets from epidemiological and clinical trial literature focusing on graphical data interpretation and LLM gap analysis. I designed over 120 problems to expose weaknesses in model understanding of statistical visuals, such as hazard ratios and odds ratios. I provided expert review for non-PhD content contributors and contributed to documentation on best practices for image-based statistical labeling.
• Emphasized difficult visualization types like confidence interval graphs
• Led technical peer review initiatives for annotation quality
• Supported internal benchmarks for LLM comprehension improvement
• Designed new processes for graphical reasoning dataset generation

2021 - 2022

Education

Stanford University

Doctor of Philosophy, Statistics
2018 - 2023
University of California, Berkeley

Master of Science, Applied Statistics
2016 - 2018

Work History

Stanford University

Graduate Teaching Assistant & Remote Content Developer

Stanford
2018 - 2019