Dheeraj Shenoy

AI Training & Data Labeling Expert | LLM Evaluator | NLP & Vision Tasks

Bangalore, India
$5.00/hr | Expert | Clickworker | CloudFactory | CrowdFlower

Key Skills

Software

Clickworker
CloudFactory
CrowdFlower
CrowdSource
CVAT
Data Annotation Tech
Dataloop
Datumbox
Datasaur
Datature
LabelImg
Label Studio
Prodigy
Redbrick AI
Remotasks
V7 Labs
Internal/Proprietary Tooling
Scale AI
Appen

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Geospatial Tiled Imagery
Video

Top Task Types

Action Recognition
Audio Recording
Bounding Box
Computer Programming Coding
Evaluation Rating

Freelancer Overview

I am a detail-oriented AI Training and Data Labeling Specialist with hands-on experience working on natural language processing (NLP), computer vision, and LLM evaluation tasks. My background includes contributing to AI projects involving text classification, sentiment analysis, image annotation, and prompt evaluation for large language models. I’ve worked with platforms like Scale.ai, Remotasks, and UHRS to help train, refine, and align cutting-edge AI systems with human judgment and real-world needs. What sets me apart is my ability to consistently deliver high-accuracy results while understanding the context and ethical implications of AI outputs. I'm comfortable working across domains including e-commerce, healthcare, and generative AI, and I bring a strong sense of responsibility, quality control, and adaptability to every task. Whether the goal is better chatbot performance, cleaner data, or more reliable AI outputs, I ensure my contributions lead to smarter and safer models.

Expert: Urdu, Hindi, Tamil, English, Marathi, Spanish

Labeling Experience

Scale AI

Tamil Video & Audio Transcription Quality Reviewer for AI Dataset

Scale AI | Video | Classification | Text Generation
Reviewed and annotated Tamil video and audio content for transcription accuracy and quality control. The task involved analyzing AI-generated transcripts for Tamil speech, flagging errors with precise timestamps, and categorizing issues as major, minor, or no error. I also provided written feedback in Tamil and English to help improve speech-to-text models used in virtual assistants and subtitling engines. My role included spotting nuances such as regional accents, slang, contextual errors, and missing words — all while ensuring grammar and meaning were preserved. This helped train high-quality voice recognition and video understanding systems for Tamil-speaking users. I consistently received high ratings for quality and attention to detail.

2023 - 2024
CrowdFlower

Freelance Linguistic QA Annotator – Marathi

CrowdFlower | Audio | Entity (NER) Classification | Classification
Reviewed and corrected transcripts of Marathi audio recordings used for training speech recognition models. Tasks included listening to native Marathi speech clips, evaluating the accuracy of transcriptions, and marking errors as major, minor, or no error using detailed criteria. I added timestamps for each error, classified the nature of the issue (e.g., misheard words, grammar, punctuation), and ensured formatting matched the style guide. This helped improve AI models for Marathi voice recognition and customer support automation. Worked on transcription and QA tasks in Marathi, Hindi, and Tamil for platforms like Appen and UHRS. Evaluated audio quality, provided timestamped error reports, and followed strict formatting rules. Helped enhance AI models for regional voice assistants and speech-to-text systems in multiple Indian languages.

2022 - 2024
Appen

Hindi Audio Transcription & Cleanup for Speech Recognition Training

Appen | Audio | Classification | Text Generation
Worked on transcription of short Hindi audio clips as part of a dataset used to train speech recognition models. The task involved listening to native Hindi speech (across various dialects) and converting it into clean, standardized text. I removed disfluencies (like fillers and stutters), corrected basic grammatical errors while preserving speaker intent, and applied formatting guidelines for clarity and consistency. Also tagged audio with timestamps and evaluated the quality of AI-generated transcriptions for error classification and feedback. This helped improve the model’s accuracy in real-world use cases like voice assistants and call center automation.

2022 - 2024
CVAT

E-commerce Product Data Classification and Annotation for NLP Model Training

CVAT | Text | Polygon | Polyline
In this project, I was responsible for classifying thousands of product listings into predefined categories and subcategories based on titles, descriptions, and user reviews. The work involved cleaning noisy text data, tagging named entities like brand, size, and color, and summarizing product descriptions to help train a recommendation engine and search relevance model. I also evaluated and rated AI-generated responses (e.g., suggested tags and descriptions) to improve prompt accuracy and model understanding. The work required high attention to detail, strict adherence to quality guidelines, and strong contextual understanding of e-commerce language. I consistently maintained top-tier accuracy scores and helped improve the model's ability to understand buyer intent and product features.

2022 - 2024

Education

Yuva Shakthi Education Society And Millennium Software Solutions

Certificate, Hardware & Networking

2014 - 2018

Work History

iMerit

Pod Lead in Data Annotation

Bangalore
2022 - Present