Rohit Pawar - Data Scientist – Intelligent Document Processing/Entity Extraction

Key Skills

Software

OpenCV AI Kit (OAK)

Data Annotation Tech

Top Subject Matter

Intelligent Document Processing and Automation

Text Classification and Attribute Extraction

Top Data Types

Document

Text

Image

Top Task Types

Entity (NER) Classification

Classification

Bounding Box

Segmentation

Object Detection

Computer Programming/Coding

Fine-tuning

Question Answering

Text Generation

Freelancer Overview

Data Scientist specializing in intelligent document processing and entity extraction, with 2+ years of professional experience building document classification and LLM-based information extraction pipelines. Core strengths include internal and proprietary tooling. Education: Master of Technology, Indian Institute of Technology, Guwahati (2023) and Bachelor of Engineering, University Institute of Technology, RGPV, Bhopal (2018). AI-training focus covers Document and Text data, with labeling workflows including Entity (NER) Classification and Classification.

Intermediate

English

Hindi

Labeling Experience

Data Scientist – Intelligent Document Processing/Entity Extraction

Document

Entity (NER) Classification

Led the development of a document processing system for automated extraction of structured information from large-scale document datasets. Designed and implemented pipelines for document classification, key-value extraction, and rule-based data transformation. Developed LLM-powered modules to extract entities and attributes from unstructured documents.
• Performed large-scale document ingestion and classification using custom models.
• Built information extraction modules utilizing LLMs for entity extraction from documents.
• Generated structured outputs in JSON and Excel to facilitate downstream analytics.
• Developed automation workflows for document labeling, processing, and validation.

2025 - Present

Data Scientist – Text Data Labeling and Annotation

Text

Classification

Developed an end-to-end pipeline for extreme multi-label text classification involving data preprocessing and annotation of large-scale textual datasets. Improved model performance through iterative feature optimization and supported robust attribute extraction using automated ETL pipelines. Collaborated with teams to ensure reliable data labeling and structured data generation for training ML models.
• Designed workflows for attribute annotation in text data.
• Created validation scripts to maintain data labeling quality standards.
• Performed multi-label classification on large document corpora.
• Supported integration of ML models trained with labeled data into operational pipelines.

2023 - 2025

Education

Indian Institute of Technology, Guwahati

Master of Technology, Electrical Engineering

2021 - 2023

University Institute of Technology, RGPV, Bhopal

Bachelor of Engineering

2014 - 2018

Work History

Livegage Technology

Data Scientist

Indore

2025 - Present

Straive

Data Scientist

Chennai

2023 - 2025