LLM Evaluation, Coding Tasks, and Multilingual AI Annotation & Transcription
Kolkata, India
$10.00/hr · Intermediate · CVAT · Doccano · Labelbox
Key Skills
Software
CVAT
Doccano
Labelbox
Prodigy
Appen
Data Annotation Tech
Scale AI
Top Subject Matter
No subject matter listed
Top Data Types
Computer Code Programming
Image
Text
Top Task Types
Bounding Box
Classification
Computer Programming/Coding
Evaluation/Rating
Translation/Localization
Freelancer Overview
An experienced Multilingual AI Data Annotator and LLM Evaluation Specialist with over three years of expertise creating, curating, and validating high-quality training data for NLP, speech-recognition, and STEM learning systems. I have labeled and vetted 120,000+ sentences in English, Hindi, and Bengali using Labelbox and Prodigy, and developed Python scripts and ETL pipelines to automate QA checks and debug data flows, boosting annotation consistency by 30%. I design and execute evaluation protocols for cutting-edge LLMs, crafting prompt-based test suites and scoring outputs against accuracy, bias, and relevance metrics, and as a transcription editor I deliver 98%+ word-accuracy transcripts for podcasts, interviews, and focus groups. I author comprehensive annotation guidelines and conduct code reviews in Python, JavaScript, SQL, and Bash. My system-administration expertise spans Linux servers, Docker containers, and AWS infrastructure, and as a software and web development specialist I architect scalable applications with React, Node.js, and Next.js and write complex SQL queries. I also develop and deliver AI training programs in coding, computer science, and STEM, designing hands-on modules and interactive assessments. Passionate about empowering AI with reliable, multilingual data, I thrive in fast-paced remote settings, continually refining processes, optimizing scripts, and driving measurable gains in model performance.
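The automated QA checks mentioned above can take many forms; a minimal sketch of one, assuming a hypothetical export format with `text` and `label` fields (real Labelbox/Prodigy exports differ by project), flags rows with empty text or labels outside the agreed label set:

```python
# Minimal annotation QA sketch (hypothetical field names and label set;
# actual export schemas vary by tool and project).
ALLOWED_LABELS = {"positive", "negative", "neutral"}  # example label set

def qa_check(rows):
    """Return (index, reason) pairs for annotation rows that fail QA."""
    issues = []
    for i, row in enumerate(rows):
        text = row.get("text", "").strip()
        label = row.get("label")
        if not text:
            issues.append((i, "empty text"))
        if label not in ALLOWED_LABELS:
            issues.append((i, f"unknown label: {label!r}"))
    return issues

rows = [
    {"text": "Great product", "label": "positive"},
    {"text": "", "label": "neutral"},
    {"text": "Meh", "label": "mixed"},
]
print(qa_check(rows))  # → [(1, 'empty text'), (2, "unknown label: 'mixed'")]
```

Running a check like this on every export batch catches schema drift and labeler mistakes before they reach training data.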
Intermediate · English · Hindi · Bengali
Labeling Experience
RLHF Code Annotation & Review
Data Annotation Tech · Computer Code Programming · Classification · Computer Programming/Coding
I reviewed 500+ Python, JavaScript, and C++ snippets plus their natural-language instructions for an instruction-following LLM. Tasks included classifying common bug types, flagging insecure patterns, writing cleaner reference solutions, and scoring model outputs for correctness, readability, and style. My annotations fed directly into RLHF fine-tuning and improved compile-success rates by 22%. I maintained a 99% audit-pass rate and provided periodic rubric feedback to tooling engineers.
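A compile-success metric implies an automated gate that checks whether snippets parse at all. One way such a gate could work for Python, sketched here with the built-in `compile()` (the project's actual tooling is not shown in this profile):

```python
# Sketch: flag Python snippets that fail to compile, the kind of
# automated correctness check behind a compile-success metric.
def compiles_ok(snippet: str) -> bool:
    """Return True if the snippet parses/compiles as Python source."""
    try:
        compile(snippet, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False

good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b)\n    return a + b\n"  # missing colon
print(compiles_ok(good), compiles_ok(bad))  # → True False
```

Equivalent gates exist for other languages (e.g. invoking `node --check` or a C++ compiler in syntax-only mode) before human reviewers score correctness and style.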
2023
Text Data Labeling Specialist
Scale AI · Text · Entity/NER Classification · Text Generation
Worked on multiple projects focused on text data annotation, including content moderation, intent classification, sentiment analysis, and entity extraction. Responsible for reviewing, labeling, and categorizing large volumes of text data to train and validate machine learning models. Ensured high accuracy by following detailed guidelines and maintaining consistency across tasks. Collaborated with project managers and QA teams to improve labeling standards and deliver quality datasets within tight deadlines.
2024 - 2025
Audio Data Labeler
Appen · Audio · Audio Recording
Marked speaker turns and non-speech sounds (e.g., laughter, coughs) with precise timestamps on raw audio clips to ensure clean segmentation for downstream tasks.
Reviewed and corrected machine-generated transcripts against original recordings, fixing misheard words and aligning text to timecodes per project style guide.
Collaborated with the annotation lead to refine labeling instructions, boosting consistency across the dataset and reducing revision cycles.
2024 - 2024
Financial Document OCR and Key-Value Annotation
Appen · Document · Entity/NER Classification · Classification
Extracted key fields (invoice number, date, vendor, line-item totals) from 15,000+ PDF invoices. Built a dual-review workflow with automated regex checks that drove data accuracy to 97% and cut manual QC time by 40%.
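The automated regex checks in a workflow like this validate extracted field values against expected formats before human review. A stdlib-only sketch, assuming hypothetical field names and patterns (real invoice formats vary by vendor and project spec):

```python
import re

# Hypothetical per-field format checks; real patterns depend on the
# client's invoice formats and the project's annotation spec.
CHECKS = {
    "invoice_number": re.compile(r"INV-\d{4,}"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),   # ISO date
    "total": re.compile(r"\d+\.\d{2}"),          # e.g. 199.99
}

def validate_record(record):
    """Return field names whose extracted value fails its pattern."""
    return [field for field, pattern in CHECKS.items()
            if not pattern.fullmatch(str(record.get(field, "")))]

rec = {"invoice_number": "INV-00123", "date": "2023-07-14", "total": "199.9"}
print(validate_record(rec))  # → ['total']  (only one decimal digit)
```

Records that pass every check can skip one of the two review passes, which is how a dual-review workflow cuts manual QC time without sacrificing accuracy.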
2023 - 2023
Urban Infrastructure Mapping
CVAT · Geospatial Tiled Imagery · Polygon · Segmentation
Annotated 5,000 km² of tiled satellite imagery to outline road networks, building footprints, and green spaces. Built an automated tiling pipeline to evenly distribute work and a QA script that checked polygon overlaps, boosting annotation throughput by 40% while maintaining 98% mean IoU.
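The polygon-overlap QA script itself is not shown; a stdlib-only sketch of the idea uses axis-aligned bounding boxes as a cheap pre-check for candidate overlaps (a production pipeline would likely follow up with exact polygon intersection, e.g. via shapely):

```python
# Sketch: cheap overlap pre-check for annotation polygons using
# axis-aligned bounding boxes; polygons are lists of (x, y) vertices.
def bbox(poly):
    """Axis-aligned bounding box (minx, miny, maxx, maxy) of a polygon."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)

def bboxes_overlap(a, b):
    """True if the bounding boxes of polygons a and b intersect."""
    ax0, ay0, ax1, ay1 = bbox(a)
    bx0, by0, bx1, by1 = bbox(b)
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

def overlapping_pairs(polygons):
    """Index pairs of polygons whose bounding boxes intersect."""
    return [(i, j)
            for i in range(len(polygons))
            for j in range(i + 1, len(polygons))
            if bboxes_overlap(polygons[i], polygons[j])]

road = [(0, 0), (4, 0), (4, 1), (0, 1)]
building = [(3, 0.5), (5, 0.5), (5, 2), (3, 2)]
park = [(10, 10), (12, 10), (12, 12)]
print(overlapping_pairs([road, building, park]))  # → [(0, 1)]
```

Flagged pairs go back to annotators for correction, which is what keeps overlap errors out of the final segmentation masks.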