Gowtham S

Machine Learning Research Engineer - Artificial Intelligence

Chennai, India
$10.00/hr
Intermediate
Other

Key Skills

Software

Other

Top Subject Matter

No subject matter listed

Top Data Types

Text

Top Label Types

Segmentation
Question Answering
Text Generation

Freelancer Overview

I am a Machine Learning Engineer with hands-on experience in large language model (LLM) training, data labeling, and dataset curation for NLP applications. My work includes preparing and refining large-scale datasets using advanced pipelines such as CC-Net, GoClassy, and Ungoliant, with custom modifications to enhance data filtering and quality scoring. I have developed and trained classifiers to label pretraining data by quality, ensuring high standards for AI training data. My technical toolkit spans Python, Rust, PyTorch, and Hugging Face, and I have designed end-to-end data pipelines for model pretraining, fine-tuning, and evaluation. I am passionate about optimizing data workflows, implementing scalable preprocessing systems, and leveraging retrieval-augmented generation (RAG) for domain-specific applications, including finance and code generation. My background in model architecture, dataset collection from diverse sources, and real-world deployment of AI solutions enables me to deliver high-quality, reliable training data for advanced machine learning projects.

Intermediate
English
Tamil

Labeling Experience

SFT & RLHF Data labelling

Other
Text
Segmentation
Question Answering

To fine-tune the model (7B parameters), we prepared 100k question-answer pairs from the document. Likewise, for reinforcement learning from human feedback (RLHF), we collected a dataset of accepted and rejected responses. For the segmentation work, we collected around 12k documents.

2025 - 2025

Education

Jerusalem College of Engineering

Bachelor of Engineering, Computer Science

2020 - 2024

Guru Nanak Matric Higher Secondary School

Higher Secondary School Certificate

2018 - 2020

Work History

BluBridge

Machine Learning Research Engineer

Chennai
2025 - Present

Zeb

Machine Learning Engineer

Chennai
2024 - 2025