Luca Gimbo - AI Training & Language Quality Specialist - Technology & Internet

Key Skills

Software

Don't disclose

Internal/Proprietary Tooling

Top Subject Matter

No subject matter listed

Top Data Types

Image

Text

Video

Audio

Top Task Types

Entity NER Classification

Evaluation Rating

Fine Tuning

RLHF

Transcription

Freelancer Overview

I am a native Italian language specialist with hands-on experience in AI training data, data labeling, and linguistic quality evaluation. My work focuses on reviewing and refining Italian language datasets for semantic accuracy, syntactic correctness, and cultural alignment, ensuring high-quality inputs for AI models. I have evaluated both AI-generated and human-produced content, provided structured annotations, and conducted image-based data assessments with a strong eye for contextual and cultural nuances. My academic background in discourse analysis and strong English proficiency enable me to operate effectively in multilingual environments. Notably, I have contributed to advanced projects in NLP and data extraction, including building automated Python workflows for large-scale data scraping and annotation, orchestrating multi-agent LLM pipelines, and developing rigorous data validation systems. I thrive in remote, quality-driven settings and am committed to delivering precise, reliable training data for AI applications.

Intermediate, German, English, Italian, Spanish

Labeling Experience

Project Echo

Internal/Proprietary Tooling, Audio, Transcription

The project focused on improving the performance of AI speech recognition models in Italian by producing high-quality, human-validated audio-to-text training data for supervised learning and RLHF pipelines. The work involved listening to Italian audio clips and editing model-generated transcripts to ensure exact alignment with spoken content, with emphasis on semantic accuracy, linguistic correctness, and meaning preservation. Tasks included correcting transcription errors, applying proper punctuation, transcribing filler words and disfluencies, marking overlapping speech when present, and flagging clips in incorrect or unrecognized languages according to strict guidelines. The project operated at scale across more than 30 languages; the Italian-language component has been running continuously for approximately four months and is still ongoing.

2025

Video quality comparison

Don't disclose, Video, Evaluation Rating

Conducted side-by-side comparison of AI-generated video outputs against the original source, evaluating which version most closely matched the original in terms of accuracy and fidelity, rather than production quality.

2025 - 2025

Project Diamond

Don't disclose, Text, Evaluation Rating

Reviewed and evaluated LLM-generated explanations of structured visuals such as graphs, charts, and diagrams. Assessed the accuracy, clarity, and completeness of responses, highlighting trends and key aspects, and provided corrections to refine outputs until they met high-quality standards.

2025 - 2025

Project Diamond

Don't disclose, Image, Entity NER Classification

Performed detailed annotation by analyzing Instagram profile images and tagging visible entities using tools such as Google Lens. Tagged entities included clothing, landmarks, locations, signage, and consumer products, with the aim of producing accurate and comprehensive metadata for AI training.

2025 - 2025

Education

University of Bologna

Bachelor of Arts, Political Science and Civics

2022 - 2025

Work History

Ristorante le due marie

Waiter

Catania

2016 - 2020