Alejandro Diaz - Data Scientist & ML Engineer

Key Skills

Software

Appen

Clickworker

Data Annotation Tech

Labelbox

Remotasks

Scale AI

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming

Document

Medical DICOM

Top Task Types

Classification

Computer Programming Coding

Geocoding

Prompt Response Writing SFT

Text Generation

Freelancer Overview

An experienced AI Training Data Specialist and Data Analyst, I have worked extensively on curating, annotating, and optimizing datasets for machine learning models. I have trained LLMs for Python on platforms like Remotasks and Toloka and contributed to marketing-focused AI models on Outlier. My expertise spans data labeling, preprocessing, and quality assurance, with a focus on producing high-accuracy datasets for AI training. My background in data engineering, machine learning, and algorithmic trading also lets me approach AI training with a strategic, data-driven mindset, improving model performance through high-quality, well-structured data.

Expert

English

Italian

Vietnamese

Chinese Mandarin

Labeling Experience

LLM Data Annotation and Fine-Tuning

Scale AI

Text

Entity NER Classification

Classification

As an AI Training Data Specialist, I have contributed to multiple projects focused on enhancing LLM performance through high-quality data labeling. My work includes annotating and classifying text for natural language understanding, fine-tuning models using Reinforcement Learning from Human Feedback (RLHF), and performing red teaming to identify model vulnerabilities. I have also worked on function-calling annotations and supervised fine-tuning (SFT) for prompt-response optimization. Adhering to strict quality guidelines, I delivered high-accuracy labels intended to improve model comprehension and contextual accuracy and to support ethical model deployment.

2024

Medical & Biological LLM Data Annotation

Scale AI

Text

Entity NER Classification

Classification

I contributed to training and fine-tuning medical and biological LLMs by labeling complex medical texts, patient records, and clinical literature. My tasks included named entity recognition (NER) for diseases, symptoms, and treatments, as well as annotating diagnosis-related text for AI-powered decision support. I also classified and summarized research papers so that AI models could extract key insights accurately. Additionally, I worked on supervised fine-tuning (SFT) and evaluation of AI-generated medical responses, focusing on factual correctness, ethical considerations, and adherence to clinical guidelines.

2023 - 2024

Education

Medical School in the Philippines

Doctor of Medicine (MD), Medicine

2022 - 2025

Ohio State University

Bachelor of Science in Computer Science, Computer Science

2021 - 2022

Work History

Tech Solutions

Machine Learning Engineer

Columbus

2022 - Present