
Dhruv Mittal

Research Intern in Contract Review, Compliance, and Legal Research

Abu Dhabi, United Arab Emirates
$20.00/hr · Entry Level · CrowdSource · Datatroniq · Datumbox

Key Skills

Software

CrowdSource
Datatroniq
Datumbox
AWS SageMaker

Top Subject Matter

Legal Services & Contract Review
Regulatory Compliance & Risk Analysis
Legal Research & Document Analysis

Top Data Types

Document
Text
Image

Top Task Types

Segmentation
Named Entity Recognition (NER) Classification

Freelancer Overview

Research Intern in Contract Review, Compliance, and Legal Research. Brings 7+ years of professional experience across complex workflows, research, and quality-focused execution. Education includes a Bachelor of Medicine and Bachelor of Surgery from Atal Bihari Vajpayee Institute of Medical Sciences (2025) and a Bachelor of Medicine and Bachelor of Surgery Internship at Dr. Ram Manohar Lohia Hospital (2025).

Entry Level · Hindi · French · English · Punjabi

Labeling Experience

Medical Image Annotation for Diabetic Retinopathy Detection and Grading Using Fundus Photography

Image · Segmentation
Contributed as a Medical Domain Expert and Image Annotation Specialist on a computer vision pipeline aimed at training a deep learning model to automatically detect and grade diabetic retinopathy from fundus photographs. The project was developed to support an AI-assisted screening tool intended for deployment in primary care settings where access to ophthalmologists is limited, enabling earlier detection and referral of high-risk diabetic patients.

Specific Data Labeling Tasks Performed:

Lesion Segmentation: performed pixel-level segmentation of pathological findings on retinal images, including:
- Microaneurysms (small red dots indicating early vascular damage)
- Haemorrhages (flame-shaped or dot/blot type)
- Hard exudates (bright yellow lipid deposits)
- Soft exudates / cotton wool spots (pale fluffy lesions indicating ischemia)
- Neovascularization (abnormal new vessel growth in PDR)

Bounding Box Annotation: drew bounding boxes around discrete lesions and anatomical landmarks, including:
- Optic disc location and margins
- Macula and foveal centre
- Focal areas of tractional changes or fibrous proliferation

Image-level Grading / Classification: assigned an overall severity grade to each fundus image using the International Clinical Diabetic Retinopathy Severity Scale:
- Grade 0: No apparent retinopathy
- Grade 1: Mild NPDR
- Grade 2: Moderate NPDR
- Grade 3: Severe NPDR
- Grade 4: Proliferative Diabetic Retinopathy (PDR)

Diabetic Macular Edema (DME) Flagging: separately classified each image for the presence and severity of DME:
- No DME
- DME not involving the foveal centre
- DME involving the foveal centre (vision-threatening; urgent referral flag)

Image Quality Assessment: screened and filtered images prior to annotation for adequate illumination and focus and for ungradable artefacts (lens opacity, camera glare, poor dilation), and assigned a gradability score (Gradable / Partially Gradable / Ungradable) to each image before it entered the annotation pipeline.

Landmark-based Annotation: placed keypoint markers on standardised anatomical reference points used to normalize image orientation and scale across the dataset, ensuring consistency for model training.

Project Size:
- Total images annotated: 3,800 fundus photographs
- Total lesion-level annotations: ~52,000 individual segmentation and bounding box instances
- Team size: 5 annotators (2 with clinical background, 3 trained lay annotators)
- Duration: 12 weeks
- Annotation tool used: CVAT (Computer Vision Annotation Tool, open-source)

Quality Measures Adhered To:
- Inter-Annotator Agreement (IAA): Cohen's Kappa of 0.82 or above maintained for image-level grading; Intersection over Union (IoU) of 0.75 or above required for all segmentation masks (see the sketch after this list)
- Annotation Guidelines: followed a 30-page clinical annotation schema aligned with the International Clinical Diabetic Retinopathy and DME Disease Severity Scales and NHS Diabetic Eye Screening Programme standards
- Gold Standard Validation: 8% of images were double-annotated by a senior clinician and used as benchmark references to calibrate annotator accuracy throughout the project
- Ungradable Image Rate: kept below 6% of the total dataset through systematic quality screening prior to annotation
- Data Privacy Compliance: all images were fully de-identified and stripped of EXIF metadata in accordance with GDPR and HIPAA Safe Harbor requirements before entering the annotation workflow
- Audit Trail: all annotation sessions were logged with annotator ID, timestamp, time-on-task per image, and revision history for full traceability and quality review
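To make the IoU acceptance criterion concrete, here is a minimal Python sketch of how a segmentation mask might be checked against a reference; the array shapes, example masks, and the mask_iou helper are hypothetical illustrations, and only the 0.75 threshold comes from the project description above.

```python
import numpy as np

def mask_iou(pred: np.ndarray, ref: np.ndarray) -> float:
    """Intersection over Union between two boolean segmentation masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    union = np.logical_or(pred, ref).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as full agreement
    return np.logical_and(pred, ref).sum() / union

# Hypothetical annotator vs. reference masks for a single lesion:
annotator = np.zeros((512, 512), dtype=bool)
reference = np.zeros((512, 512), dtype=bool)
annotator[100:200, 100:200] = True
reference[110:210, 110:210] = True

iou = mask_iou(annotator, reference)
print(f"IoU = {iou:.3f}; meets 0.75 threshold: {iou >= 0.75}")
```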


2026 - Present

Clinical NLP Data Annotation for Symptom Extraction and Disease Classification in Pulmonary Conditions

Text · Named Entity Recognition (NER) Classification
Data Types: unstructured and semi-structured text:
- De-identified patient discharge summaries
- Radiology reports (chest X-ray and HRCT findings)
- Clinical notes from pulmonology outpatient consultations

Labelling Type: Multi-label Text Annotation / Named Entity Recognition (NER) / Classification

Subject Matter: Respiratory Medicine / Pulmonology, covering conditions including COPD, Pulmonary Fibrosis, Pneumonia, Pleural Effusion, and Pulmonary Tuberculosis

Project Description: Contributed as a Medical Domain Expert and Clinical Data Annotator on a healthcare NLP pipeline aimed at training a machine learning model to automatically extract clinically relevant entities from pulmonary patient records. The project was designed to support a clinical decision-support system that could flag high-risk respiratory patients for early intervention.

Specific Data Labeling Tasks Performed:

Named Entity Recognition (NER) Tagging: identified and tagged clinical entities within free-text notes, including:
- Symptoms (e.g., dyspnea, haemoptysis, crepitations)
- Diagnoses (e.g., COPD exacerbation, community-acquired pneumonia)
- Medications (e.g., salbutamol, budesonide, azithromycin)
- Lab values and vitals (e.g., SpO2 88%, FEV1/FVC ratio 0.62)
- Procedures (e.g., bronchoscopy, pulmonary function test)

Relation Extraction Annotation: labelled relationships between entities, for example:
- "Patient administered salbutamol for acute bronchospasm" → Drug–Indication relationship
- "HRCT showed bilateral ground-glass opacities consistent with COVID-19 pneumonia" → Finding–Diagnosis relationship

Sentence-level Classification: classified individual sentences from clinical notes into predefined categories:
- Chief Complaint / History of Presenting Illness
- Past Medical History
- Examination Finding
- Investigation Result
- Impression / Diagnosis
- Treatment Plan

Severity Scoring Labels: applied standardized severity labels to diagnoses based on clinical criteria:
- COPD: GOLD Stage I–IV
- Pneumonia: Mild / Moderate / Severe / Critical (based on CURB-65 indicators present in the text)

Negation and Uncertainty Tagging: flagged negated findings (e.g., "no pleural effusion noted") and uncertain language (e.g., "likely fibrotic changes") to prevent the model from treating absent findings as present, a critical nuance in clinical NLP.

Inter-annotator Disagreement Resolution: participated in weekly calibration sessions to resolve conflicting labels between annotators, using adjudication guidelines developed with the clinical lead.

Project Size:
- Total records annotated: 2,400 clinical documents
- Total annotation instances: ~38,000 individual entity tags
- Team size: 6 annotators (3 clinical, 3 non-clinical)
- Duration: 10 weeks
- Annotation tool used: Label Studio (open-source)

Quality Measures Adhered To:
- Inter-Annotator Agreement (IAA): Cohen's Kappa maintained at 0.80 or above across all entity categories (a minimal computation sketch follows this list)
- Annotation Guidelines: followed a 24-page internal clinical annotation schema developed in alignment with SNOMED CT and ICD-11 terminology
- Gold Standard Validation: 10% of documents were randomly selected as a gold standard set, reviewed by a senior clinician, and used to benchmark annotator accuracy
- Error Rate: individual annotator error rate kept below 5% per review cycle
- Data Privacy Compliance: all documents were de-identified in accordance with HIPAA Safe Harbor standards prior to annotation
- Audit Trail: every label change was logged with timestamp and annotator ID for full traceability
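As a rough illustration of how an IAA figure like the 0.80 Cohen's Kappa above might be computed between two annotators, here is a minimal self-contained Python sketch; the entity labels and the cohens_kappa helper are hypothetical examples, not the project's actual tooling.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators' labels over the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement expected from each annotator's label distribution
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical entity labels from two annotators on the same ten spans:
a = ["SYMPTOM", "DIAGNOSIS", "MEDICATION", "SYMPTOM", "PROCEDURE",
     "DIAGNOSIS", "SYMPTOM", "LAB_VALUE", "MEDICATION", "DIAGNOSIS"]
b = ["SYMPTOM", "DIAGNOSIS", "MEDICATION", "DIAGNOSIS", "PROCEDURE",
     "DIAGNOSIS", "SYMPTOM", "LAB_VALUE", "MEDICATION", "SYMPTOM"]

kappa = cohens_kappa(a, b)
print(f"kappa = {kappa:.2f}; meets 0.80 threshold: {kappa >= 0.80}")
```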


2025

Education

St. Vincent’s Hospital, University of Melbourne

Clinical Elective, Endocrinology

2025
Dr. Ram Manohar Lohia Hospital

Bachelor of Medicine and Bachelor of Surgery Internship, Medicine and Surgery

2024 - 2025

Work History

Burjeel Medical City

Research Intern

Abu Dhabi
2026 - Present
St. Vincent’s Hospital

Clinical Elective, Endocrinology

Melbourne
2025