English Antonio

Senior Machine Learning Engineer — RLHF Data Collection & AI Training

Remote, USA
$20.00/hr, Expert, Scale AI, Other

Key Skills

Software

Scale AI
Other

Top Subject Matter

Enterprise AI agents
Finance NLP models
Human preference data

Top Data Types

Text
Document

Top Task Types

RLHF
Fine Tuning

Freelancer Overview

Senior Machine Learning Engineer focused on RLHF data collection and AI training. Brings 9+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Core strengths include Scale AI, Other, and IBM Research (Internal. Education includes a Master of Science (2018) and a Bachelor of Science (2016), both from the University of Alabama at Birmingham. AI-training focus includes text data and labeling workflows such as RLHF and fine-tuning.

English: Expert

Labeling Experience

Scale AI

Senior Machine Learning Engineer — RLHF Data Collection & AI Training

Scale AI, Text, RLHF
Designed and implemented RLHF pipelines for fine-tuning production-grade large language models in an enterprise context. Led end-to-end human preference data collection, reward model training, and PPO-based policy optimization cycles. Focused on AI agent behavior improvement and reduction of factual errors in domain-specific finance applications.
• Developed custom reward models and preference datasets for RLHF experiments
• Collected and curated human-labeled preference pairs for optimization
• Trained reward models to reflect enterprise requirements and user expectations
• Contributed to measurable improvements in LLM factuality and reduction in hallucination rates


2022 - Present

RLHF Fine-Tuning Pipeline for Finance Domain LLM

Other, Text, RLHF
Built an end-to-end RLHF fine-tuning pipeline targeting finance-domain LLMs for B2B agents. Collected and labeled 15,000 human preference pairs, trained a Bradley-Terry reward model, and fine-tuned the policy with PPO. Deployed the resulting model as a production agent with enhanced factuality and domain reasoning capabilities.
• Designed data pipelines for large-scale human preference data labeling
• Implemented RLHF methodology to optimize model behavior for finance tasks
• Measured and validated factual error reduction on the FinanceBench benchmark
• Integrated agent deployment using the LangChain framework and FastAPI


2024
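
For illustration, the Bradley-Terry reward-model objective mentioned in this project can be sketched as below. This is a minimal sketch under stated assumptions, not the pipeline's actual code: the function names bradley_terry_loss and preference_probability are hypothetical, and the rewards are plain floats standing in for reward-model outputs.

```python
import math


def bradley_terry_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the chosen response beats the rejected one.

    Under the Bradley-Terry model, P(chosen > rejected) = sigmoid(r_c - r_r),
    so the per-pair loss is -log(sigmoid(r_c - r_r)).
    Hypothetical helper for illustration only.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """Model probability that the chosen response is preferred."""
    return math.exp(-bradley_terry_loss(reward_chosen, reward_rejected))


# Toy preference pairs: (reward for preferred answer, reward for rejected answer).
pairs = [(1.2, 0.3), (0.1, 0.4), (2.0, -1.0)]
mean_loss = sum(bradley_terry_loss(c, r) for c, r in pairs) / len(pairs)
print(f"mean pairwise loss: {mean_loss:.4f}")
```

The loss shrinks as the reward model scores the preferred response further above the rejected one, which is the signal a reward model is trained on before PPO is applied.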

Machine Learning Engineer — Foundation Model Fine-Tuning & Data Annotation

Text, Fine Tuning
Led multiple large language model fine-tuning projects on legal, financial, and healthcare documents using supervised instruction techniques. Focused on domain adaptation for BERT and GPT-2 variants, ensuring high relevance and accuracy for enterprise applications. Collected, labeled, and curated specialized document corpora to support ongoing fine-tuning and evaluation cycles.
• Employed supervised fine-tuning and instruction tuning for multiple model architectures
• Curated and labeled domain-specific textual datasets for training and validation
• Collaborated with teams to translate requirements into actionable labeling tasks
• Established and maintained automated dataset management pipelines


2020 - 2022

ML Engineer — Deep Learning NLP Data Annotation & Model Training

Text, Fine Tuning
Designed and executed fine-tuning and annotation processes for deep learning sequence-to-sequence NLP tasks on clinical text. Supported model training with labeled clinical documents for summarization, classification, and entity recognition, delivering state-of-the-art results. Authored publications and maintained annotated corpora for ongoing research initiatives.
• Developed labeled clinical datasets to support NLP model evaluation and comparison
• Annotated text for summarization, classification, and NER tasks using proprietary tools
• Contributed to improvement of health informatics benchmarks through data labeling
• Presented results and methodology at major NLP conferences


2018 - 2020

Education

University of Alabama at Birmingham

Master of Science, Computer Science

2016 - 2018

University of Alabama at Birmingham

Bachelor of Science, Computer Science and Mathematics

2012 - 2016

Work History

Scale AI

Senior Machine Learning Engineer

Remote
2022 - Present

IBM

Machine Learning Engineer – NLP & Foundation Models

Birmingham, AL
2020 - 2021