Usman Abdulsalam

Singing Voice Corpus Annotator — OpenTrain AI — Freelance

Kaduna, Nigeria
$17.00/hr · Intermediate · Other · Appen · Argilla

Key Skills

Software

Other
Appen
Argilla
CVAT
CrowdSource
Axiom AI
Scale AI
Toloka
Internal/Proprietary Tooling

Top Subject Matter

Singing Voice Corpus Annotation for AI Synthesis
Large Language Model (LLM) Agent Evaluation
Singing Voice Dataset for AI Model Training

Top Data Types

Audio
Text
Video
Document

Top Task Types

Transcription
Data Collection
Red Teaming
Prompt/Response Writing (SFT)
Segmentation
Polygon
Bounding Box
Fine Tuning
RLHF
Question Answering
Evaluation Rating
Text Summarization
Text Generation
Cuboid
Point / Key Point
Computer Programming / Coding
Function Calling
Entity (NER) Classification
Polyline

Freelancer Overview

Singing voice corpus annotator with 2+ years of professional experience across complex professional workflows, research, and quality-focused execution. Education includes a Bachelor of Engineering from Ahmadu Bello University, Zaria (2023) and a Diploma in Computer Engineering from the same university (2016). AI-training focus covers data types such as audio and text, and labeling workflows including transcription, evaluation, and rating.

Intermediate · Hausa · Yoruba · English

Labeling Experience

Founder & Lead Engineer (Deaf-Tech Dataset & Labeling)

Other · Text · Data Collection
I coordinated the creation and validation of a proprietary Hausa-to-Nigerian Sign Language dictionary dataset. I managed collaboration with NSL interpreters and Deaf academies to build and annotate a verified sign language video dataset. The structured dataset held significant value for assistive technology and was used in AI models for translation between Hausa speech and NSL. • Oversaw annotation workflows with domain experts and end-users. • Validated sign accuracy through expert review and community feedback. • Dataset supported model training for real-time Hausa speech-to-sign translation. • Custom labeling and dictionary building for low-resource, regional language-to-sign mapping.


2025 - Present
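A dictionary dataset of this kind is typically organized as per-sign records with an explicit expert-validation trail. A minimal sketch of such a record — the `SignEntry` class, its field names, and the review threshold are all hypothetical, not the project's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class SignEntry:
    """One Hausa-to-NSL dictionary record (illustrative schema only)."""
    hausa_word: str                                   # source-language lemma
    gloss_en: str                                     # English gloss for cross-checking
    video_id: str                                     # ID of the annotated sign video clip
    validated_by: list = field(default_factory=list)  # expert reviewers who signed off

    def is_validated(self, min_reviewers: int = 2) -> bool:
        # An entry counts as verified once enough experts have reviewed it.
        return len(self.validated_by) >= min_reviewers

entry = SignEntry("ruwa", "water", "clip_0412")
entry.validated_by += ["interpreter_A", "deaf_academy_B"]
```

Tracking reviewers per entry, rather than a single boolean flag, is what makes the community-feedback loop described above auditable.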

AI Data Specialist (RLHF/Code Evaluation)

Other · Text · RLHF
I evaluated Python code and AI agent responses as part of reinforcement learning from human feedback (RLHF) workflows for large language models. I focused on correctness, coding standards, and security, contributing scores and feedback for use in RLHF fine-tuning pipelines. Hybrid workflows leveraged AI agents for repetitive extraction while I performed quality control and complex judgment. • Reviewed code for correctness, PEP 8 style, and security flaws. • Scored and rated agent responses for RLHF model training. • Supported LLM companies in building high-quality code/data evaluation sets. • Participated in hybrid AI-human annotation and review processes.


2025 - Present
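Reviews like these usually pair automated checks with human judgment before a score enters the RLHF pipeline. A toy sketch of the automated half — the `review_snippet` helper and the 1-to-5 scale are invented for illustration:

```python
import ast

def review_snippet(code: str) -> dict:
    """Toy code reviewer: flags syntax errors and overlong lines,
    then emits an RLHF-style score record (rubric is invented)."""
    issues = []
    try:
        ast.parse(code)  # does the snippet even parse?
    except SyntaxError as exc:
        issues.append(f"syntax error: {exc.msg}")
    for n, line in enumerate(code.splitlines(), 1):
        if len(line) > 79:  # PEP 8 maximum line length
            issues.append(f"line {n} exceeds 79 chars")
    score = 5 - min(len(issues), 4)  # 5 = clean, 1 = many issues
    return {"score": score, "issues": issues}

print(review_snippet("def add(a, b):\n    return a + b\n"))
```

In practice a human reviewer layers correctness and security judgment on top of mechanical checks like these; the point of the sketch is the shape of the score record, not the rubric itself.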

Web Data Extraction and Automation

Other · Text · Data Collection
I built complex Python web scraping workflows to extract structured text data suitable for AI model training and evaluation. I integrated tools for both cloud-based deployment and LLM-assisted data normalization and extraction. I delivered validated, clean datasets in CSV and JSON formats for downstream AI applications and labeling pipelines. • Leveraged BeautifulSoup, Selenium, Playwright, Apify, and OpenRouter in workflow. • Ensured full cycle from data extraction to cleanliness and normalization. • Produced datasets for NLP, search, and mapping AI agents. • Datasets were fully validated for quality control.


2025 - Present
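The extract → normalize → export cycle can be sketched with the standard library alone — the page markup and the field name below are invented, and production runs used BeautifulSoup/Selenium/Playwright as noted above:

```python
import json
from html.parser import HTMLParser

class TitleScraper(HTMLParser):
    """Minimal stdlib stand-in for the extraction step:
    collects the text of every <h2> element on a page."""
    def __init__(self):
        super().__init__()
        self.titles, self._in_h2 = [], False
    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False
    def handle_data(self, data):
        if self._in_h2 and data.strip():
            self.titles.append(data.strip())  # normalize whitespace

html = "<h1>Site</h1><h2> First Post </h2><p>...</p><h2>Second Post</h2>"
scraper = TitleScraper()
scraper.feed(html)
records = [{"title": t} for t in scraper.titles]  # normalized rows
print(json.dumps(records))  # JSON deliverable, as in the workflow above
```

The same row structure serializes to CSV with `csv.DictWriter`; the validation step then runs over `records` before delivery.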

LLM Agent Evaluation Scenario Writer

Other · Text
I designed structured evaluation scenarios for LLM-based AI agents simulating practical use cases. I defined golden path behaviors, edge cases, and scoring rubrics in JSON and YAML formats for agent behavior assessment. I reviewed agent outputs and iterated on scenarios to ensure coverage and clarity for fine-tuning and reinforcement learning purposes. • Created multi-turn scenarios including calendar, email, maps, and productivity app simulations. • Documented acceptable behaviors and edge case handling for real-world coverage. • Used JSON/YAML as scenario definition and labeling format. • Work contributed directly to structured LLM agent evaluation and RLHF data pipelines.


2025 - Present
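A scenario definition of the kind described above bundles the dialogue turns, the golden path, edge cases, and a rubric into one machine-readable record. A sketch that builds one and serializes it to JSON — every key name and rubric weight here is invented, not a real platform schema:

```python
import json

# Illustrative multi-turn evaluation scenario (hypothetical schema).
scenario = {
    "id": "calendar_001",
    "turns": [
        {"user": "Book a meeting with Ada tomorrow at 10am."},
        {"user": "Actually, move it to 2pm."},
    ],
    "golden_path": [  # expected tool calls, in order
        "create_event(time='10:00')",
        "update_event(time='14:00')",
    ],
    "edge_cases": ["conflicting existing event", "ambiguous contact name"],
    "rubric": {"task_completed": 3, "no_hallucinated_tools": 2},
}
print(json.dumps(scenario, indent=2))
```

The same structure maps directly onto YAML; keeping the golden path and edge cases in the record itself is what lets reviewers score agent transcripts against it mechanically.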

Singing Voice Corpus Annotation

Other · Audio · Transcription
I annotated English singing voice data at the phoneme level, carefully marking millisecond-precision timestamps and pitch values in Hz. I labeled musical notes using Praat and Sonic Visualiser, and used Montreal Forced Aligner to automate phoneme alignment, drastically reducing manual effort. The result was structured TextGrid files and datasets for singing voice synthesis AI model training. • Data included phoneme segmentation, pitch annotation, and note labeling at high temporal resolution. • Used tools such as Praat, Sonic Visualiser, and Montreal Forced Aligner (MFA). • Delivered approximately 10 to 15 hours of annotated audio for corpus development. • Data supported AI model training for singing voice synthesis.


2025 - Present
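Phoneme tiers like those described above are stored as Praat TextGrid files. A minimal writer sketch for a single interval tier in the long text format — the phoneme boundaries below are made up, and a real corpus would carry pitch and note tiers alongside:

```python
def to_textgrid(intervals, tier="phones"):
    """Serialize (xmin, xmax, label) triples into a single-tier
    Praat TextGrid (long text format). Timings are in seconds."""
    xmax = intervals[-1][1]
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {xmax}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier}"',
        "        xmin = 0",
        f"        xmax = {xmax}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (lo, hi, text) in enumerate(intervals, 1):
        lines += [
            f"        intervals [{i}]:",
            f"            xmin = {lo}",
            f"            xmax = {hi}",
            f'            text = "{text}"',
        ]
    return "\n".join(lines)

# Two phonemes of a sung syllable, millisecond-precision boundaries.
print(to_textgrid([(0.0, 0.412, "s"), (0.412, 0.93, "i")]))
```

Montreal Forced Aligner emits files in this same TextGrid format, which is why its automatic alignments slot directly into a manual Praat review pass.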

Education


Ahmadu Bello University, Zaria

Bachelor of Engineering, Computer Engineering

2016 - 2023

Ahmadu Bello University, Zaria

Diploma in Computer Engineering, Computer Engineering

2014 - 2016

Work History


Murajaah AI Platform

Backend Developer

Zaria
2025 - Present

Deaf-Tech

Founder and Lead Engineer

Zaria
2025 - Present