We are seeking support to create an English singing voice synthesis corpus consisting of 10 to 15 hours of recorded singing. The dataset must be labeled at the phoneme level, with start and end timestamps annotated to millisecond precision. Each recording must also carry pitch or musical note annotations at the same precision. Open source projects such as GTSinger can serve as a reference throughout the process.
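To make the labeling requirement concrete, here is a minimal sketch of what one annotation record and a basic consistency check could look like. The field names, the use of MIDI note numbers for pitch, and the ARPAbet-style phoneme symbols are all assumptions for illustration; they are not taken from GTSinger or from any format this project has fixed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PhonemeLabel:
    """One labeled phoneme segment (hypothetical schema)."""
    phoneme: str              # assumed ARPAbet-style symbol, e.g. "AH"
    start_ms: int             # segment start, in milliseconds
    end_ms: int               # segment end, in milliseconds
    midi_note: Optional[int]  # assumed pitch encoding; None for silence/unvoiced


def midi_to_hz(note: int) -> float:
    """Convert a MIDI note number to Hz (equal temperament, A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def validate(labels: List[PhonemeLabel]) -> List[str]:
    """Return a list of human-readable problems; empty means the track passed."""
    problems = []
    for i, lab in enumerate(labels):
        # Every segment needs a positive duration and a non-negative start.
        if lab.start_ms < 0 or lab.end_ms <= lab.start_ms:
            problems.append(
                f"label {i} ({lab.phoneme}): bad interval {lab.start_ms}-{lab.end_ms}"
            )
        # MIDI note numbers are defined on the range 0..127.
        if lab.midi_note is not None and not 0 <= lab.midi_note <= 127:
            problems.append(f"label {i} ({lab.phoneme}): note {lab.midi_note} out of range")
        # Segments must be in time order and must not overlap.
        if i > 0 and lab.start_ms < labels[i - 1].end_ms:
            problems.append(f"label {i} ({lab.phoneme}): overlaps previous segment")
    return problems


if __name__ == "__main__":
    track = [
        PhonemeLabel("HH", 0, 85, None),
        PhonemeLabel("AH", 85, 412, 69),
        PhonemeLabel("L", 412, 470, 69),
        PhonemeLabel("OW", 470, 1030, 71),
    ]
    print(validate(track))
    print(midi_to_hz(69))
```

A check like this could run at each quality checkpoint before a batch is delivered, so that timing errors are caught before they reach model training.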
Total Budget
$3,000
Pay per Label
-
Time Requirement
Less than 20 hrs/week
Duration
1 month
Workload / Schedule
Expected weekly commitment is under 20 hours. Project duration is expected to be under 1 month. Labelers should follow milestone deadlines and quality checkpoints.