Josue Valencia - Senior Statistical Content Developer – Anthropic AI Research Division

Key Skills

Software

Other

Scale AI

Surge AI

Labelbox

CloudFactory

Appen

Top Subject Matter

Statistical content for large language model training

Statistical reasoning for LLM training

Statistical data for AI model evaluation

Top Data Types

Text

Image

Top Task Types

Question Answering

Freelancer Overview

Senior Statistical Content Developer – Anthropic AI Research Division. Brings 2+ years of professional experience in research and quality-focused execution across complex workflows. Core strengths include internal and proprietary tooling. Education includes an Associate's Degree from Foothill College (2025) and a Doctor of Philosophy from Stanford University (2023). AI-training focus covers Text data and Question Answering labeling workflows.

English (Expert)

Labeling Experience

Senior Statistical Content Developer – Anthropic AI Research Division

Text, Question Answering

I designed and authored over 150 original statistics question-answer pairs monthly, drawing on research publications to strengthen LLM statistical reasoning. I crafted multimodal questions requiring integration of visual statistical content, such as graphs and mathematical notation, to systematically identify image-reading failures. I collaborated closely with ML engineers to ensure the generated data targeted and challenged model weaknesses in multimodal reasoning.

• Maintained 98% peer-review accuracy for technical statistical content
• Authored prompts specifically to address known LLM knowledge gaps
• Piloted methods for exposing multimodal reasoning deficiencies
• Validated annotations through iterative feedback with the engineering team

2024 - Present

Lead Research Annotator - OpenAI Content Solutions

Other, Text, Question Answering

I created over 200 high-complexity statistical reasoning scenarios derived from econometric and biostatistical literature to advance AI model training. I specialized in extracting insights from visual data such as figures, regression tables, and probability distributions to formulate multimodal statistical annotation tasks. I developed detailed annotation guidelines and pioneered methods to identify visual-only statistical information that challenges model limits.

• Led a 15-person annotation team focused on statistical reasoning
• Created benchmark datasets for visual statistical comprehension
• Developed rubric and calibration guidelines for multimodal annotation
• Drove methodology for testing LLM visual information processing limits

2023 - 2024

Statistical Dataset Architect – Scale AI Academic Research Division

Scale AI, Text, Question Answering

I authored more than 300 question-answer pairs from peer-reviewed studies, specifically targeting LLM vulnerabilities in interpreting advanced statistical visualizations like forest plots, correlation matrices, and survival curves. I performed rigorous quality assurance reviews to ensure accuracy, mathematical rigor, and high pedagogical value in all annotated content. I achieved a 96% inter-rater reliability score for technical accuracy through expert assessment and review cycles.

• Focused on Bayesian statistics, causal inference, and experimental design annotation
• Produced datasets addressing model failures in visual statistics comprehension
• Managed technical documentation for annotation best practices
• Guaranteed technical content met educational benchmarks for AI models

2023 - 2023

Remote Statistical Content Specialist – DataRobot AI Training Labs

Other, Text, Question Answering

I generated original statistical problems and scenarios based on recent AI research publications in time series analysis, spatial statistics, and machine learning. I developed over 175 complex problems requiring combined interpretation of multiple figures, tables, and statistical notations to train multimodal models. I worked collaboratively with international statisticians to ensure content diversity and technical rigor for LLM evaluation.

• Identified model failure patterns in interpreting images with embedded notation
• Developed datasets emphasizing integration of text and visual information
• Coordinated with global PhD statisticians for cross-validation
• Pioneered challenging multimodal evaluation problem creation

2022 - 2022

PhD Research Consultant – Surge AI Statistics Team

Surge AI, Text, Question Answering

I created specialized question sets from epidemiological and clinical trial literature focusing on graphical data interpretation and LLM gap analysis. I designed over 120 problems to expose weaknesses in model understanding of statistical visuals, such as plots of hazard ratios and odds ratios. I provided expert review for non-PhD content contributors and contributed to documentation on best practices for image-based statistical labeling.

• Emphasized difficult visualization types like confidence interval graphs
• Led technical peer review initiatives for annotation quality
• Supported internal benchmarks for LLM comprehension improvement
• Designed new processes for graphical reasoning dataset generation

2021 - 2022

Education

Stanford University

Doctor of Philosophy, Statistics

2018 - 2023

University of California, Berkeley

Master of Science, Applied Statistics

2016 - 2018

Work History

Stanford University

Graduate Teaching Assistant & Remote Content Developer

Stanford

2018 - 2019