Arif Basha Shaik - LLM Data Annotation Lead (AI Business Assistant/Chatbot Project)

Key Skills

Software

CVAT

Doccano

Labelbox

LabelImg

Label Studio

Roboflow

OpenCV AI Kit (OAK)

Other

Top Subject Matter

LLM Chatbot/Generative AI

Legal Services & Contract Review

Regulatory Compliance & Risk Analysis

Top Data Types

Image

Medical DICOM

Text

Document

Top Task Types

Bounding Box

Classification

Entity (NER) Classification

Object Detection

Segmentation

Question Answering

Freelancer Overview

LLM Data Annotation Lead (AI Business Assistant/Chatbot Project) with 5+ years of professional experience spanning complex professional workflows, research, and quality-focused execution. Core strengths include working with internal and proprietary tooling. AI-training focus covers text data and labeling workflows such as question answering.

Expert English

Labeling Experience

Optical Character Recognition System – Document Segmentation/Annotation

Other, Document, Segmentation

I developed an end-to-end optical character recognition pipeline for extracting structured data from documents. The pipeline required document data labeling, including table structure detection and segmentation annotation. This effort resulted in improved model performance and increased information extraction accuracy for clinical and unstructured documents.

• Document annotation for OCR segmentation
• Table structure labeling using CascadeTabNet and PaddleOCR
• Data augmentation for enhanced model generalization
• Focus on extracting data from clinical documents

2022 - Present

Forklift Safety Monitoring System – Object Detection/Data Annotation

OpenCV AI Kit (OAK), Text, Object Detection, Question Answering

As a Data Scientist, I led the data pipeline design for training and fine-tuning large language models (LLMs) for chatbot applications. This included annotation of question-answer pairs and curating datasets for generative AI use cases. Labeling ensured the reliability and context-awareness of AI-generated responses.

• Annotated and validated question-answer datasets for LLM chatbot training
• Designed and managed end-to-end data labeling process for RAG-based workflows
• Selected and curated documents for knowledge-based training data
• Performed quality checks to ensure annotation consistency and accuracy

2022 - Present

Education

JNTUK

M.Tech, Machine Design

M.Tech

2015 - 2017

Work History

Eion Technology Pvt Ltd

Data Scientist

Hyderabad

2022 - Present