Arif Basha Shaik

LLM Data Annotation Lead (AI Business Assistant/Chatbot Project)

Hyderabad, India
$18.00/hr
Expert
CVAT, Doccano, Labelbox

Key Skills

Software

CVAT
Doccano
Labelbox
LabelImg
Label Studio
Roboflow
OpenCV AI Kit (OAK)
Other

Top Subject Matter

LLM Chatbot/Generative AI
Legal Services & Contract Review
Regulatory Compliance & Risk Analysis

Top Data Types

Image
Medical DICOM
Text
Document

Top Task Types

Bounding Box
Classification
Entity (NER) Classification
Object Detection
Segmentation
Question Answering

Freelancer Overview

LLM Data Annotation Lead (AI Business Assistant/Chatbot Project) with 5+ years of professional experience across complex professional workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. AI-training focus includes text data and labeling workflows such as question answering.

English (Expert)

Labeling Experience

Optical Character Recognition System – Document Segmentation/Annotation

Other, Document, Segmentation
I developed an end-to-end optical character recognition pipeline for extracting structured data from documents. The pipeline required document data labeling, including table structure detection and segmentation annotation. This effort resulted in improved model performance and increased information extraction accuracy for clinical and unstructured documents.
• Document annotation for OCR segmentation
• Table structure labeling using CascadeTabNet and PaddleOCR
• Data augmentation for enhanced model generalization
• Focus on extracting data from clinical documents


2022 - Present
OpenCV AI Kit (OAK)

Forklift Safety Monitoring System – Object Detection/Data Annotation

OpenCV AI Kit (OAK), Text, Object Detection, Question Answering
As a Data Scientist, I led the data pipeline design for training and fine-tuning large language models (LLMs) for chatbot applications. This included annotation of question-answer pairs and curating datasets for generative AI use cases. Labeling ensured the reliability and context-awareness of AI-generated responses.
• Annotated and validated question-answer datasets for LLM chatbot training
• Designed and managed end-to-end data labeling process for RAG-based workflows
• Selected and curated documents for knowledge-based training data
• Performed quality checks to ensure annotation consistency and accuracy


2022 - Present

Education


JNTUK

M.Tech, Machine Design

M.Tech
2015 - 2017

Work History


Eion Technology

Data Scientist

Hyderabad
2022 - Present

EION TECHNOLOGY PVT LTD

Data Scientist

Hyderabad
2022 - Present