Aamir Ansari - LLM & Data Annotation Specialist | Text, Image & Video Labeling

Key Skills

Software

AWS SageMaker

Appen

Axiom AI

CloudFactory

Google Cloud Vertex AI

OpenCV AI Kit (OAK)

Scale AI

Internal/Proprietary Tooling

Top Subject Matter

No subject matter listed

Top Data Types

Audio

Computer Code Programming

Document

Top Task Types

Computer Programming Coding

Data Collection

Function Calling

Mapping

Object Detection

Freelancer Overview

I have experience working with AI-driven applications that rely heavily on high-quality training data, structured datasets, and accurate data interpretation. In my roles as a Python and AI Developer, I have worked with NLP, LLMs, and RAG-based systems where data preprocessing, annotation consistency, and evaluation quality were central to the success of the models. I have handled tasks such as intent classification, entity extraction, structured output validation, and dataset preparation for chatbot training and generative AI workflows. I’m comfortable working with both text-based and tabular data, including cleaning, labeling, validating, and refining datasets to improve model accuracy. My background in building AI chatbots, data processing pipelines, and scalable backend systems has given me a practical understanding of how the quality of training data directly affects model performance. I approach labeling work carefully and systematically, focusing on clarity, accuracy, and adherence to guidelines.

Expert

Hindi

Arabic

English

Labeling Experience

AI Chatbot Training Data Labeling & Intent Classification

Appen

Text

Entity (NER) Classification

I worked on building and refining high-quality training data for a custom AI chatbot system. The project involved classifying user intents, extracting entities, and generating structured query and response pairs for RAG-based dialogue flows. I also performed manual annotation of conversation samples to improve natural language understanding and contextual accuracy. Data records were reviewed for consistency, corrected for ambiguity, and validated against project labeling guidelines. The training dataset supported multilingual usage and covered varied conversational scenarios such as customer queries, troubleshooting steps, and knowledge-base retrieval workflows.

Quality standards followed:
Cross-reviewed annotations for consistency
Maintained version-controlled datasets
Used clear label taxonomies and annotation rules
Performed periodic evaluation and refinement based on model feedback

2022

Education

Kalinga University

Bachelor of Computer Application, Computer Application

2019 - 2022

Work History

Xccelerance Technology

AI Software Developer

Indore

2025 - Present

Closeloop Technology Pvt Ltd

Python Developer

Mohali

2024 - 2025