Lalith Sai

AI Training Data – RAG Evaluation Dataset

Tirupati, India
$20.00/hr
Intermediate
LabelImg, Img Lab, Humanatic

Key Skills

Software

LabelImg
Img Lab
Humanatic
Google Cloud Vertex AI
Dataloop
Data Annotation Tech
Snorkel AI

Top Subject Matter

LLM Evaluation
RAG Chatbot
Plant Disease Classification

Top Data Types

Text
Image
Computer Code Programming

Top Task Types

Classification
Fine-tuning
Data Collection
Computer Programming/Coding
Object Detection

Freelancer Overview

AI trainer whose recent work includes an AI training data project, a RAG evaluation dataset. Brings 2+ years of professional experience in research and quality-focused execution of complex workflows. Core strengths include internal and proprietary tooling and OpenCV. Education: Bachelor of Technology, Sree Rama Engineering College (2022). AI-training work covers text, image, and structured data, with labeling workflows including evaluation, rating, and classification.

Intermediate
English
Telugu

Labeling Experience

Image Dataset Annotation – Plant Disease Classification

Image, Classification
I annotated and preprocessed a multi-class image dataset of crop leaves for a plant disease classification project. Labels for healthy, diseased, and specific disease types were applied consistently, with rigorous quality controls in place. Image augmentation techniques expanded the labeled set to support model generalization and improve accuracy.
• Maintained high dataset integrity by removing poor-quality or ambiguous images.
• Utilized OpenCV and PyTorch for data augmentation and preprocessing.
• Validated annotation consistency before model training and testing.
• Demonstrated the impact of label quality on computer vision model results.

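To illustrate what validating annotation consistency before training can involve, here is a minimal sketch in plain Python. It is not the project's actual tooling: the label names, file names, and the check_annotations helper are all hypothetical. It flags labels outside an agreed class list and images that received more than one distinct label.

```python
from collections import Counter, defaultdict

# Hypothetical class list; the real project's label names are not stated.
ALLOWED_LABELS = {"healthy", "early_blight", "late_blight"}


def check_annotations(records):
    """Check (image_id, label) pairs for unknown labels and conflicts.

    Returns a tuple (unknown_labels, conflicting_images, class_counts).
    """
    labels_per_image = defaultdict(set)
    for image_id, label in records:
        labels_per_image[image_id].add(label)

    all_labels = {lab for labs in labels_per_image.values() for lab in labs}
    # Labels that are not part of the agreed class list.
    unknown = sorted(all_labels - ALLOWED_LABELS)
    # Images that were given more than one distinct label.
    conflicts = sorted(
        img for img, labs in labels_per_image.items() if len(labs) > 1
    )
    # Class balance, counted once per distinct (image, label) pair.
    counts = Counter(lab for labs in labels_per_image.values() for lab in labs)
    return unknown, conflicts, counts


# Made-up example records, for illustration only.
records = [
    ("leaf_001.jpg", "healthy"),
    ("leaf_002.jpg", "early_blight"),
    ("leaf_002.jpg", "late_blight"),  # same image, two different labels
    ("leaf_003.jpg", "blite"),        # typo that is not in the class list
]
unknown, conflicts, counts = check_annotations(records)
print(unknown, conflicts, dict(counts))
```

A check like this is cheap to run on every export of the annotation file, so mislabeled or doubly labeled images can be reviewed before they reach training.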

2025 - 2025

AI Training Data – RAG Evaluation Dataset

Text
This role focused on creating a robust Q&A evaluation dataset for an LLM-powered retrieval system. I manually crafted and annotated over 250 Q&A pairs to test diverse retrieval scenarios, tagging source documents by topic, type, and relevance. Annotation guidelines were developed and refined iteratively to maximize label quality and model performance.
• Developed and documented annotation guidelines for consistency.
• Tagged documents with rich metadata to support filtered search and retrieval accuracy.
• Used evaluation sets to identify hallucination patterns and improve pipeline quality.
• Continuously improved labeling based on model outputs and performance feedback.

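As a rough illustration of how an evaluation set of this kind can be used, the sketch below scores a retriever by checking whether each question's source document appears in the top-k results. Everything here is an assumption made for the example: the EvalItem fields, the document IDs, the toy_retrieve stand-in, and the choice of hit rate as the metric are not taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalItem:
    question: str
    reference_answer: str
    source_doc_id: str  # document that contains the answer
    topic: str          # metadata tag, usable for per-topic breakdowns


def hit_rate_at_k(items, retrieve, k=3):
    """Fraction of items whose source document is in the top-k results."""
    if not items:
        return 0.0
    hits = sum(
        1 for item in items
        if item.source_doc_id in retrieve(item.question)[:k]
    )
    return hits / len(items)


def toy_retrieve(question):
    """Hypothetical keyword retriever standing in for the real pipeline."""
    index = {
        "refund": ["doc_policy", "doc_faq"],
        "login": ["doc_faq", "doc_auth"],
    }
    for keyword, doc_ids in index.items():
        if keyword in question.lower():
            return doc_ids
    return []


# Made-up evaluation items, for illustration only.
items = [
    EvalItem("How do I get a refund?", "Within 30 days.", "doc_policy", "billing"),
    EvalItem("Why does login fail?", "Reset your password.", "doc_auth", "account"),
    EvalItem("Where is my invoice?", "In the billing tab.", "doc_billing", "billing"),
]
score_k1 = hit_rate_at_k(items, toy_retrieve, k=1)
score_k2 = hit_rate_at_k(items, toy_retrieve, k=2)
print(score_k1, score_k2)
```

Questions that miss at every k point to retrieval gaps, which is one place to look when an answer is not grounded in any retrieved document.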

2025 - 2025

Structured Data Labeling – IoT Telemetry Anomaly Detection

Classification
I labeled raw IoT telemetry streams with normal or anomaly tags based on domain thresholds, providing ground truth to support the training and evaluation of an unsupervised anomaly detection model. Data was cleaned and structured, with missing values handled and noise removed for high-quality input. Feature engineering further ensured that labeled patterns were accurately captured in ML pipelines.
• Produced ground truth labels for anomaly detection tasks using domain knowledge.
• Engineered features and validated their relevance using labeled data.
• Standardized data formatting for ML readiness.
• Ensured labeling quality by cross-checking against domain standards.

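Threshold-based labeling of the kind described above can be sketched in a few lines of Python. The sensor names and limits below are hypothetical placeholders, not the project's actual domain thresholds. Readings with a missing field are left unlabeled so they can be handled in a separate cleaning step.

```python
# Hypothetical per-sensor (low, high) limits; real values would come
# from domain standards for the equipment being monitored.
THRESHOLDS = {
    "temperature_c": (0.0, 85.0),
    "vibration_mm_s": (0.0, 7.1),
}


def label_reading(reading):
    """Label one telemetry reading as 'normal' or 'anomaly'.

    Returns None when any thresholded field is missing, so incomplete
    readings are not silently labeled.
    """
    if any(reading.get(field) is None for field in THRESHOLDS):
        return None
    in_range = all(
        low <= reading[field] <= high
        for field, (low, high) in THRESHOLDS.items()
    )
    return "normal" if in_range else "anomaly"


# Made-up readings, for illustration only.
readings = [
    {"temperature_c": 42.0, "vibration_mm_s": 2.3},
    {"temperature_c": 91.5, "vibration_mm_s": 2.1},   # too hot
    {"temperature_c": 40.0, "vibration_mm_s": None},  # missing value
]
labels = [label_reading(r) for r in readings]
print(labels)
```

Keeping the limits in one table makes the labeling rule easy to audit against the domain standards it is derived from.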

2024 - 2025

Education

Sree Rama Engineering College

Bachelor of Technology, Artificial Intelligence and Data Science

2022

Work History

Activist

Open Source Contributor

Tirupati
2025 - 2025

Proven Solution

ML Engineer Intern

Tirupati
2024 - 2025