Thushanthiga Sivalingam - AI Data Annotation Specialist

Key Skills

Software

Scale AI

CVAT

Other

Label Studio

Top Subject Matter

NLP Domain Expertise

Sentiment Analysis

Intent Classification

Top Data Types

Text

Image

Document

Top Task Types

Entity (NER) Classification

Bounding Box

Data Collection

Classification

RLHF

Freelancer Overview

AI Data Annotation Specialist with 3+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Core strengths include Scale AI, CVAT, and internal tools. Education includes a Master of Science in Artificial Intelligence from Sri Lanka Institute of Information Technology (2024) and a Bachelor of Science in Physical Science from University of Colombo (2023). AI-training focus includes data types such as Text and Image and labeling workflows including Entity (NER) Classification, Bounding Box, and Data Collection.

Intermediate, Tamil, English

Labeling Experience

Instruction Dataset Curator/Annotator

Label Studio, Text, RLHF

I contributed to the creation and curation of instruction-tuning datasets for LLM fine-tuning, generating high-quality prompt-response pairs with RLHF preference annotations. I integrated inter-annotator agreement metrics and automatic rejection for low-consensus batches. I ensured the datasets met rigorous quality standards used in production and academic research.
• Produced 12,000+ prompt-response pairs with RLHF annotation
• Implemented Cohen’s Kappa agreement for annotator quality control
• Developed and utilized human-in-the-loop interfaces for review
• Supported LLM fine-tuning pipelines with quality-checked data

2024 - 2024

Junior AI Engineer (Data Annotation)

Text, Data Collection

As a Junior AI Engineer, I collaborated on training data pipeline construction, including data collection strategy, schema design, and label validation workflows. My responsibilities included preparing labeled evaluation datasets for NLP model benchmarking. I ensured the integrity and quality of annotated data used in model development.
• Designed and maintained schema for annotation projects
• Performed data collection and curation for NLP applications
• Conducted label validation for text evaluation datasets
• Supported R&D for text classification, NER, and semantic similarity tasks

2024 - 2024

Sign Language Dataset Creator/Annotator

Other, Image, Classification

I collected, annotated, and validated over 8,000 gesture images to create a training dataset for sign language recognition. The work included custom augmentation pipelines and manual QA to ensure dataset fidelity. I supported the open-source release of a CNN-LSTM hybrid model for real-time ASL gesture recognition.
• Created and annotated gesture image dataset from scratch
• Used custom augmentation for dataset expansion and diversity
• Conducted manual QA and validation on annotated image sets
• Enabled research and deployment for ASL recognition system

2023 - 2023

AI Data Annotation Specialist

CVAT, Image, Bounding Box

In my role as an AI Data Annotation Specialist, I performed image and bounding box annotation for object detection and segmentation projects using CVAT and Label Studio. The data domains included retail, medical imaging, and autonomous vehicle imagery. I validated labels for accuracy and collaborated with a team to reduce labeling errors through QA checklists.
• Annotated thousands of images for object detection and segmentation
• Used CVAT and Label Studio to label bounding boxes in diverse domains
• Collaborated with QA team to implement error-reduction workflows
• Supported medical, retail, and autonomous vehicle projects

2023 - 2023

AI Data Annotation Specialist

Scale AI, Text, Entity (NER) Classification

As an AI Data Annotation Specialist, I annotated over 50,000 text samples for NLP model training, including sentiment analysis, intent classification, NER, and dialogue act categorization. I curated and cleaned LLM instruction fine-tuning datasets, including RLHF preference pairs, and developed annotation guidelines for team adoption. I performed quality assurance checks and created documentation to ensure consistent labeling standards.
• Annotated 50,000+ text samples for classification and NER tasks
• Curated RLHF preference pairs and prompt-response quality scoring datasets
• Developed and formalized annotation and QA guidelines for a team of 6
• Used Label Studio, Labelbox, Scale AI, and Prodigy for dataset validation and labeling

2023 - 2023

Education

University of Colombo

Bachelor of Science, Physical Science

2020 - 2023

Sri Lanka Institute of Information Technology

Master of Science, Artificial Intelligence

2024

Work History

Apptimus Tech

AI Engineer

Colombo

2025 - Present

Apptimus Tech

Associate AI Engineer

Colombo

2024 - 2025