Santiago Barbosa Virguez

Integrated and fine-tuned LLMs with a focus on training data quality and pipeline optimization

Bogotá, Colombia
$40.00/hr, Expert, AWS SageMaker

Key Skills

Software

AWS SageMaker

Top Subject Matter

Large Language Models and NLP
NLP Model Training and Integration

Top Data Types

Text

Top Task Types

Fine Tuning

Freelancer Overview

Integrated and fine-tuned LLMs with a focus on training data quality and pipeline optimization. Brings 15+ years of professional experience across complex workflows, research, and quality-focused execution. Core strength: AWS SageMaker. Education: Bachelor of Science, Universidad Nacional de Colombia (2012). AI-training focus covers text data and fine-tuning workflows.

English (Expert)

Labeling Experience

AWS SageMaker

Integrated and fine-tuned LLMs with a focus on training data quality and pipeline optimization

AWS SageMaker, Text, Fine Tuning

I collaborated closely with data science teams to integrate, fine-tune, and optimize large language models (LLMs) by preparing and curating relevant datasets. My responsibilities included ensuring data quality for model training, facilitating machine learning best practices, and supporting regular updates to AI systems. I frequently leveraged tools and cloud-based solutions to manage the data and streamline the training pipeline.

• Integrated and prepared text data for machine learning model training and fine-tuning.
• Worked on ensuring high-quality, diverse, and relevant datasets for LLM evaluation and updates.
• Contributed to model versioning, deployment, and ongoing quality improvement initiatives.
• Utilized AWS SageMaker and internal/proprietary tooling for data management and training workflows.

2019 - Present

AWS SageMaker

Prepared and labeled training data for LLMs and NLP models

AWS SageMaker, Text, Fine Tuning

I worked closely with machine learning researchers to prepare, label, and quality-assure large text datasets for NLP model development and integration. My efforts were focused on improving LLM accuracy and accelerating experimentation through model training and continuous data ingestion. I leveraged AWS SageMaker and custom internal tools to streamline and automate the training and evaluation process.

• Assisted with preparation and curation of training data for deep learning NLP models.
• Supported fine-tuning and evaluation of LLMs by providing annotated text data.
• Automated and monitored model training pipelines for efficiency and scalability.
• Collaborated on integrating NLP models and maintaining dataset integrity.

2016 - 2019

Education

Universidad Nacional de Colombia

Bachelor of Science, Computer Science
2008 - 2012

Work History

MetaLab

Senior Software Engineer

Bogotá
2019 - Present

Amazon

Senior Software Engineer

Seattle
2016 - 2019