High-Precision Audio Labeling
Annotated 10+ hours of English singing audio at the phoneme level, ensuring temporal alignment within 1–5 milliseconds for voice synthesis training. Conducted millisecond-precise annotation of pitch and musical note information, aligning vocal performance with symbolic musical scores. Executed high-precision labeling of vocal techniques, including breathing patterns and glottal stops, using spectrogram interpretation.