Vida Ahmadi - Farsi Speech-to-Text Annotation and Transcription Refinement — Data Annotation Project

Key Skills

Software

Other

Label Studio

Top Subject Matter

Farsi Speech Recognition

Legal Services & Contract Review

Regulatory Compliance & Risk Analysis

Top Data Types

Text

Document

Audio

Top Task Types

Entity (NER) Classification

Classification

Text Summarization

Fine-tuning

Computer Programming/Coding

Text Generation

Transcription

Freelancer Overview

Data annotator focused on Farsi speech-to-text annotation and transcription refinement. Primary tool is Label Studio. AI-training work centers on audio data and transcription labeling workflows.

Entry Level, Persian (Farsi), English

Labeling Experience

Farsi Speech-to-Text Annotation and Transcription Refinement — Data Annotation Project

Label Studio, Audio, Transcription

In this data annotation project, Farsi audio-text pairs were annotated and reviewed to support AI-driven speech recognition systems. The role included refining machine-generated transcriptions to ensure accuracy, alignment, and linguistic consistency. Annotation followed strict guidelines to produce high-quality dataset outputs.

• Conducted manual annotation and review of Farsi audio paired with text for speech recognition training.
• Refined and validated automated transcriptions, ensuring linguistic accuracy and correct alignment.
• Used Label Studio as the primary software for segmentation and quality assurance.
• Maintained dataset quality by carefully following project-specific annotation standards.

2026 - 2026

Clinical Text Re-Annotation & Entity Alignment (Academic Project)

Other, Text, Entity (NER) Classification

As part of my Master's thesis on clinical NLP, I experimented with re-annotating portions of the i2b2 2014 medical dataset using MetaMap for automated medical concept extraction.

Tasks performed:
• Ran MetaMap to extract UMLS concepts from clinical narratives
• Compared extracted entities with existing gold annotations
• Adjusted entity spans to align with token-level BIO format
• Cleaned text while preserving character offsets
• Reviewed boundary mismatches between automated and original annotations
• Conducted small-scale manual validation to verify alignment accuracy

Project scope: Academic-scale experimentation (subset of clinical narratives)

Quality measures:
• Manual inspection of entity span mismatches
• Offset validation to prevent annotation drift
• Verification of medical concept normalization
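The span-to-BIO alignment step described above can be illustrated with a minimal sketch. It is not the thesis code: the whitespace tokenizer, the function names `tokenize_with_offsets` and `spans_to_bio`, the example sentence, and the `PROBLEM` label are all assumptions chosen for illustration. It assumes entity spans arrive as character offsets (start inclusive, end exclusive) and tags every token that overlaps a span.

```python
import re


def tokenize_with_offsets(text):
    """Split on whitespace, keeping (token, start, end) character offsets."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def spans_to_bio(text, spans):
    """Convert character-offset spans to token-level BIO tags.

    spans: list of (start, end, label) tuples, end exclusive.
    The label names are hypothetical, not taken from the dataset.
    """
    tokens = tokenize_with_offsets(text)
    tags = ["O"] * len(tokens)
    for start, end, label in sorted(spans):
        inside = False
        for i, (_, tok_start, tok_end) in enumerate(tokens):
            # A token belongs to the entity if the two ranges overlap.
            if tok_start < end and tok_end > start:
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return [(tok, tag) for (tok, _, _), tag in zip(tokens, tags)]


example_text = "Patient denies chest pain today."
example_spans = [(15, 25, "PROBLEM")]  # covers "chest pain"
for token, tag in spans_to_bio(example_text, example_spans):
    print(token, tag)
```

Because tokens carry their original offsets, a span whose boundary falls mid-token still tags the whole token, which is one simple way to surface the boundary mismatches mentioned above for manual review.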

2024 - 2024

Education

Polytechnic University of Turin

Master of Technology, Data Science and Engineering

2022 - 2025

Kurdistan University

Bachelor of Technology, Information Technology

2010 - 2014

Work History

Zharfa Company

Python Developer

Kurdistan

2019 - 2021

Banta Pardaz Pooyan

Android Developer

Kurdistan

2017 - 2019