Romit Dubey - Data Processing and Text Classification for Digitization (IIT Bombay Internship)

Key Skills

Software

LabelImg

Roboflow

SuperAnnotate

Top Subject Matter

Historical Manuscript Digitization and Information Retrieval

Top Data Types

Document

Text

Image

Top Task Types

Classification

Bounding Box

Object Detection

Computer Programming and Coding

Freelancer Overview

Brings 2+ years of professional experience spanning research and quality-focused work, with a background in data processing and text classification for digitization (IIT Bombay internship). Core strengths include internal and proprietary tooling. Holds a Master of Computer Application from Banaras Hindu University (2024). AI-training focus covers document data and classification labeling workflows.

Intermediate English

Labeling Experience

Data Processing and Text Classification for Digitization (IIT Bombay Internship)

Document Classification

Built and automated OCR workflows to extract, clean, and structure text from large volumes of scanned historical manuscripts. Applied data preprocessing and normalization to create datasets suitable for downstream AI applications. Validated and processed extracted content to ensure high accuracy for research purposes.

• Created machine-readable datasets from noisy document sources
• Automated pipeline for text digitization using Python-based tools
• Performed quality checks on labeled data to meet research standards
• Enabled search and retrieval on digitized content
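The cleaning and normalization step described above could look roughly like the following. This is a minimal sketch using only the Python standard library, not the pipeline actually used in the internship: the function name `clean_ocr_text` and the specific rules (Unicode normalization, rejoining words hyphenated across line breaks, collapsing whitespace, dropping lines with no letters or digits) are illustrative assumptions.

```python
import re
import unicodedata


def clean_ocr_text(raw: str) -> str:
    """Normalize raw OCR output into cleaner text (hypothetical helper)."""
    # Unify Unicode forms, e.g. the ligature "ﬁ" becomes "fi".
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words split by a hyphen at a line break: "manu-\nscript" -> "manuscript".
    # Assumption: this also merges genuine hyphenated compounds broken at a line end.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    cleaned_lines = []
    for line in text.splitlines():
        # Collapse runs of spaces and tabs, then trim the ends.
        line = re.sub(r"[ \t]+", " ", line).strip()
        # Keep only lines with at least one letter or digit; others are treated as scan noise.
        if line and any(ch.isalnum() for ch in line):
            cleaned_lines.append(line)
    return "\n".join(cleaned_lines)
```

A call such as `clean_ocr_text(page_text)` would be applied to each page of OCR output before the text is stored or indexed for search. Real manuscript data would likely need script-specific rules beyond these.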

2022 - 2023

Education

Banaras Hindu University

Master of Computer Application, Computer Application

2024

Work History

Indian Institute of Technology

Software Development Intern

Mumbai

2022 - 2023