Dumindu Weerasinghe

Technical AI Subject Matter Expert | Software Engineering & Statistics

Colombo, Sri Lanka
$15.00/hr | Entry Level | Other | Internal/Proprietary Tooling

Key Skills

Software

Other
Internal/Proprietary Tooling

Top Subject Matter

Computer Science & Software Development (Python, Java, SQL)
Quantitative Finance & Advanced Statistics
Legal & Medical Documentation Analysis

Top Data Types

Document
Text
Image

Top Task Types

Entity (NER) Classification
Text Generation
Classification
Fine Tuning
Text Summarization
Computer Programming/Coding
Data Collection

Freelancer Overview

I am a technical Subject Matter Expert with a dual-degree background in Software Engineering (B.Eng.) and Industrial Statistics & Financial Mathematics (B.Sc.). I specialize in high-precision data curation and the evaluation of complex logical, mathematical, and programming datasets. I have built AI-driven tools on LLMs such as Google Gemini and Mistral AI, which gives me firsthand insight into prompt engineering and model output validation. Whether the task is labeling sophisticated code snippets in Java, Python, and SQL or verifying complex statistical proofs and time-series models (ARIMA/GARCH), I bring the analytical rigor of a statistician and the technical proficiency of an engineer to ensure 100% data integrity.

I have a proven track record in large-scale data processing and quality control. At Hithertech LTD, I designed automated data validation frameworks that reduced system error rates by 18% across more than 1,000 daily records. For my "Detoxium" project, I engineered custom Python web scrapers to collect specialized data from 50+ independent sources and improved domain data accuracy by 40% through meticulous cleaning and validation. My leadership experience in AIESEC has made me a reliable professional who follows complex annotation guidelines with close attention to detail, which suits high-level RLHF (Reinforcement Learning from Human Feedback) and technical labeling tasks.

Entry Level | English | Sinhalese | Tamil

Labeling Experience

Clinical Feature Labeling & Data Validation – Survival Prediction System

Medical DICOM | Classification
I worked with a dataset of clinical medical records to develop a predictive system for patient survival. My role involved data preprocessing and feature labeling for 12+ clinical variables. I performed rigorous 5-fold cross-validation to ensure the "ground truth" labels were consistent across the dataset. By identifying and correcting mislabeled or outlier data points, I enabled the machine learning algorithms (Logistic Regression, KNN) to achieve an 87% accuracy rate in predicting patient outcomes.

2026 - 2026

AI-Powered Document Analysis Pipeline Labeling – Contract Shield Personal Project

Other | Document | Entity (NER) Classification
I managed the data pipeline for an AI-powered legal document analyzer. This involved manually annotating and "ground-truthing" thousands of legal clauses to train the Google Gemini API for high-precision extraction. I classified entities such as "Termination Clauses," "Liability Limits," and "Effective Dates." Through rigorous manual verification and iterative testing, I achieved 98% accuracy in clause extraction, ensuring the model could identify complex legal language with minimal hallucination.

2026 - 2026

Conversational AI Training and Intent Annotation – Detoxium Project

Other | Text | Text Generation
For the Detoxium AI recovery assistant, I performed end-to-end data curation for a specialized healthcare chatbot. I wrote custom Python scrapers to collect raw data from 50+ independent sources. I then manually labeled and categorized user intents to train the Mistral AI model to distinguish between medical inquiries, crisis support requests, and general information. I implemented meticulous data cleaning and validation protocols that improved domain data accuracy by 40%, ensuring the AI provided safe and context-aware responses for sensitive healthcare data.

2024 - 2025

Education

University of Colombo

Bachelor of Science with Honors, Industrial Statistics and Financial Mathematics

2024

University of Westminster

Bachelor of Engineering with Honors, Software Engineering

2023

Work History

Hithertech

Software Engineer Intern

London
2025 - Present