Lalith Sai - AI Training Data – RAG Evaluation Dataset

Key Skills

Software

LabelImg

Img Lab

Humanatic

Google Cloud Vertex AI

Dataloop

Data Annotation Tech

Snorkel AI

Top Subject Matter

LLM Evaluation

RAG Chatbot

Plant Disease Classification

Top Data Types

Text

Image

Computer Code Programming

Top Task Types

Classification

Fine-tuning

Data Collection

Computer Programming/Coding

Object Detection

Freelancer Overview

AI Training Data – RAG Evaluation Dataset. Brings 2+ years of professional experience across complex workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling and OpenCV. Education: Bachelor of Technology, Sree Rama Engineering College (2022). AI-training work covers text, image, and structured data, with labeling workflows including evaluation, rating, and classification.

Intermediate English, Telugu

Labeling Experience

Image Dataset Annotation – Plant Disease Classification

Image, Classification

I annotated and preprocessed a multi-class image dataset of crop leaves for a plant disease classification project. Labels for healthy, diseased, and specific disease types were applied consistently, with rigorous quality controls in place. Image augmentation techniques expanded the labeled set to support model generalization and improve accuracy.

• Maintained high dataset integrity by removing poor-quality or ambiguous images.

• Utilized OpenCV and PyTorch for data augmentation and preprocessing.

• Validated annotation consistency before model training and testing.

• Demonstrated the impact of label quality on computer vision model results.
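The consistency validation described above could look roughly like the following. This is a minimal sketch, not the project's actual tooling: the label names, the record layout (image id, content hash, label), and the function names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical label set; the profile only mentions "healthy", "diseased",
# and unnamed specific disease types.
ALLOWED_LABELS = {"healthy", "early_blight", "late_blight"}


def validate_annotations(records):
    """Return a list of problems found in (image_id, content_hash, label) records."""
    problems = []
    labels_by_hash = defaultdict(set)

    for image_id, content_hash, label in records:
        # Every label must come from the agreed label set.
        if label not in ALLOWED_LABELS:
            problems.append(f"{image_id}: unknown label '{label}'")
        labels_by_hash[content_hash].add(label)

    # Identical image content must not carry different labels.
    for content_hash, labels in sorted(labels_by_hash.items()):
        if len(labels) > 1:
            problems.append(f"hash {content_hash}: conflicting labels {sorted(labels)}")
    return problems


def class_counts(records):
    """Count examples per label, useful for spotting class imbalance."""
    return Counter(label for _, _, label in records)


records = [
    ("leaf_001.jpg", "a1", "healthy"),
    ("leaf_002.jpg", "b2", "early_blight"),
    ("leaf_003.jpg", "b2", "late_blight"),  # same content as leaf_002, different label
    ("leaf_004.jpg", "c3", "rust"),         # label outside the allowed set
]

problems = validate_annotations(records)
for problem in problems:
    print(problem)
```

Running a check like this before training keeps duplicate images with contradictory labels, or labels outside the agreed set, from reaching the model.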

2025 - 2025

AI Training Data – RAG Evaluation Dataset

Text

This role focused on creating a robust Q&A evaluation dataset for an LLM-powered retrieval system. I manually crafted and annotated over 250 Q&A pairs to test diverse retrieval scenarios, tagging source documents by topic, type, and relevance. Annotation guidelines were developed and refined iteratively to maximize label quality and model performance.

• Developed and documented annotation guidelines for consistency.

• Tagged documents with rich metadata to support filtered search and retrieval accuracy.

• Used evaluation sets to identify hallucination patterns and improve pipeline quality.

• Continuously improved labeling based on model outputs and performance feedback.
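As an illustration of how annotated Q&A pairs can be used to score a retriever, here is a minimal sketch that computes hit rate at k. The `EvalItem` fields, the document ids, and the `toy_retrieve` function are hypothetical stand-ins; the profile does not describe the real schema or pipeline.

```python
from dataclasses import dataclass


@dataclass
class EvalItem:
    question: str
    expected_answer: str
    source_doc_ids: set  # documents annotated as relevant to the question
    topic: str


def hit_rate_at_k(items, retrieve, k=3):
    """Fraction of questions with at least one annotated source document in the top-k results."""
    if not items:
        return 0.0
    hits = 0
    for item in items:
        top_k = retrieve(item.question)[:k]
        if item.source_doc_ids & set(top_k):
            hits += 1
    return hits / len(items)


def toy_retrieve(question):
    """Hypothetical retriever returning canned, ranked document ids."""
    canned = {
        "What is the refund window?": ["faq_07", "policy_01", "blog_12"],
        "Who approves travel requests?": ["blog_12", "faq_07"],
    }
    return canned.get(question, [])


items = [
    EvalItem("What is the refund window?", "30 days", {"policy_01"}, "policy"),
    EvalItem("Who approves travel requests?", "The line manager", {"hr_03"}, "hr"),
]

print(hit_rate_at_k(items, toy_retrieve, k=3))
print(hit_rate_at_k(items, toy_retrieve, k=1))
```

Questions where the annotated source never appears in the retrieved set are natural candidates for checking whether the model answers anyway, which is one way an evaluation set can surface hallucination patterns.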

2025 - 2025

Structured Data Labeling – IoT Telemetry Anomaly Detection

Classification

I labeled raw IoT telemetry streams as normal or anomalous based on domain thresholds, providing ground truth for training and validating unsupervised anomaly detection models. Data was cleaned and structured, with missing values handled and noise removed to produce high-quality input. Feature engineering further ensured that labeled patterns were accurately captured in ML pipelines.

• Produced ground truth labels for anomaly detection tasks using domain knowledge.

• Engineered features and validated their relevance using labeled data.

• Standardized data formatting for ML readiness.

• Ensured labeling quality by cross-checking against domain standards.
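Threshold-based labeling of this kind can be sketched as follows. The field names, the threshold values, and the choice of forward-filling missing readings are assumptions for illustration only; the profile does not say which sensors, limits, or imputation strategy were used.

```python
# Hypothetical per-field (low, high) limits; real domain thresholds are not given in the profile.
THRESHOLDS = {
    "temperature_c": (0.0, 85.0),
    "vibration_mm_s": (0.0, 7.1),
}


def fill_missing(readings):
    """Forward-fill None values per field (an assumed strategy); leading gaps stay None."""
    filled = []
    last_seen = {}
    for row in readings:
        clean = {}
        for field in THRESHOLDS:
            value = row.get(field)
            if value is None:
                value = last_seen.get(field)
            else:
                last_seen[field] = value
            clean[field] = value
        filled.append(clean)
    return filled


def label_row(row):
    """Return 'anomaly' if any available field falls outside its limits, else 'normal'."""
    for field, (low, high) in THRESHOLDS.items():
        value = row[field]
        if value is None:
            continue  # no evidence for this field
        if not (low <= value <= high):
            return "anomaly"
    return "normal"


readings = [
    {"temperature_c": 41.5, "vibration_mm_s": 2.0},
    {"temperature_c": None, "vibration_mm_s": 2.3},  # missing temperature
    {"temperature_c": 92.0, "vibration_mm_s": 2.1},  # temperature above limit
    {"temperature_c": 40.0, "vibration_mm_s": 9.4},  # vibration above limit
]

labels = [label_row(row) for row in fill_missing(readings)]
print(labels)
```

Keeping the thresholds in one table makes it straightforward to cross-check the labels against the domain standards they were derived from.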

2024 - 2025

Education

Sree Rama Engineering College

Bachelor of Technology, Artificial Intelligence and Data Science

2022

Work History

Activist

Open Source Contributor

Tirupati

2025 - 2025

Proven Solution

ML Engineer Intern

Tirupati

2024 - 2025