Larry Arnold

Independent AI Researcher and LLM Evaluator

Catlin, USA
$50.00/hr · Intermediate · Google Cloud Vertex AI · Remotasks

Key Skills

Software

Google Cloud Vertex AI
Remotasks

Top Subject Matter

AI Model Evaluation and Red Teaming

Top Data Types

Text
Computer Code Programming

Top Task Types

Text Generation
Question Answering
Object Detection
Red Teaming
Computer Programming / Coding
Prompt Response Writing (SFT)
Function Calling
Data Collection
RLHF
Text Summarization
Transcription
Classification

Freelancer Overview

Independent AI Researcher and LLM Evaluator. Brings 4+ years of professional experience across complex workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. Education includes a Bachelor of Science from Maryville University (2024) and an Undergraduate Certificate from Maryville University (2024). AI-training focus includes data types such as Text and labeling workflows including Evaluation and Rating.

English (Intermediate)

Labeling Experience

Independent AI Researcher and LLM Evaluator

Text
• Conducted structured evaluations of large language models (LLMs) to assess reasoning accuracy, hallucination risk, and behavioral consistency.
• Designed and implemented adversarial testing and prompt-based experiments targeting alignment weaknesses and vulnerabilities.
• Developed, executed, and documented 500+ structured prompts across multiple LLM platforms using Python automation tools.
• Created comprehensive experiment libraries for analysis of LLM output reliability and edge-case failures.
• Benchmarked and compared model performance across logical reasoning, knowledge generation, and research assistance tasks.
• Developed internal tools for prompt testing, behavioral data collection, and experiment logging.
• Applied evaluation frameworks to both proprietary and open-source models including ChatGPT, Claude, and Gemini.
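The prompt-automation and experiment-logging workflow described above could be sketched roughly as follows. This is an illustrative minimal sketch only; all names (`run_experiment`, `PromptRecord`, the stubbed model) are hypothetical and do not represent the actual internal tooling.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Callable, List


@dataclass
class PromptRecord:
    """One logged prompt/response pair from a structured evaluation run."""
    prompt_id: str
    prompt: str
    response: str
    model: str
    timestamp: str


def run_experiment(prompts: List[str], model_name: str,
                   generate: Callable[[str], str]) -> List[PromptRecord]:
    """Run each prompt through `generate` and collect structured records."""
    records = []
    for i, prompt in enumerate(prompts):
        response = generate(prompt)
        records.append(PromptRecord(
            prompt_id=f"p{i:04d}",
            prompt=prompt,
            response=response,
            model=model_name,
            timestamp=datetime.now(timezone.utc).isoformat(),
        ))
    return records


def to_jsonl(records: List[PromptRecord]) -> str:
    """Serialize records as JSON Lines for downstream analysis."""
    return "\n".join(json.dumps(asdict(r)) for r in records)


if __name__ == "__main__":
    # Stubbed model for demonstration; a real run would call an LLM API here.
    echo_model = lambda p: f"[stub response to: {p}]"
    records = run_experiment(["What is 2+2?", "Name a prime."],
                             "stub-model", echo_model)
    print(to_jsonl(records))
```

Logging each prompt/response pair as one JSON Lines record keeps runs append-only and easy to diff or aggregate across models.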


2023 - Present

Education

Maryville University

Undergraduate Certificate, Artificial Intelligence

2024
Maryville University

Bachelor of Science, Computer Science

2024

Work History

Self-Directed

Independent AI Researcher and LLM Evaluator

Catlin, USA
2023 - Present