Rohit Pawar

Data Scientist – Intelligent Document Processing/Entity Extraction

Bhopal, India
$25.00/hr | Intermediate | OpenCV AI Kit (OAK) | Data Annotation Tech

Key Skills

Software

OpenCV AI Kit (OAK)
Data Annotation Tech

Top Subject Matter

Intelligent Document Processing and Automation
Text Classification and Attribute Extraction

Top Data Types

Document
Text
Image

Top Task Types

Entity (NER) Classification
Classification
Bounding Box
Segmentation
Object Detection
Computer Programming/Coding
Fine-tuning
Question Answering
Text Generation

Freelancer Overview

Data Scientist specializing in Intelligent Document Processing and Entity Extraction, with 2+ years of professional experience spanning complex workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. Education includes a Master of Technology from the Indian Institute of Technology, Guwahati (2023) and a Bachelor of Engineering from the University Institute of Technology, RGPV, Bhopal (2018). AI-training focus covers Document and Text data, with labeling workflows including Entity (NER) Classification and Classification.

Intermediate | English | Hindi

Labeling Experience

Data Scientist – Intelligent Document Processing/Entity Extraction

Document | Entity (NER) Classification
Led the development of a document processing system for automated extraction of structured information from large-scale document datasets. Designed and implemented pipelines for document classification, key-value extraction, and rule-based data transformation. Developed LLM-powered modules to extract entities and attributes from unstructured documents.
• Performed large-scale document ingestion and classification using custom models.
• Built information extraction modules utilizing LLMs for entity extraction from documents.
• Generated structured outputs in JSON and Excel to facilitate downstream analytics.
• Developed automation workflows for document labeling, processing, and validation.

2025 - Present

Data Scientist – Text Data Labeling and Annotation

Text | Classification
Developed an end-to-end pipeline for extreme multi-label text classification involving data preprocessing and annotation of large-scale textual datasets. Improved model performance through iterative feature optimization and supported robust attribute extraction using automated ETL pipelines. Collaborated with teams to ensure reliable data labeling and structured data generation for training ML models.
• Designed workflows for attribute annotation in text data.
• Created validation scripts to maintain data labeling quality standards.
• Performed multi-label classification on large document corpora.
• Supported integration of ML models trained with labeled data into operational pipelines.

2023 - 2025

Education

Indian Institute of Technology, Guwahati

Master of Technology, Electrical Engineering

2021 - 2023

University Institute of Technology, RGPV, Bhopal

Bachelor of Engineering, Engineering

2014 - 2018

Work History

Livegage Technology

Data Scientist

Indore
2025 - Present

Straive

Data Scientist

Chennai
2023 - 2025