Data Annotator
As a reviewer on the Xylophone Panda Prompt Generation project, I contributed to training an AI speech model to generate natural, human-like conversational responses. My role involved creating and evaluating authentic audio prompts across diverse categories, such as high-quality dialogues, knowledge-based exchanges, and role-play scenarios, while ensuring each recording reflected real-world spontaneity (e.g., pauses, inflections, and background noise). I adhered to strict quality measures, including:

Natural Delivery: Avoiding scripted tones and capturing genuine speech patterns (e.g., sighs, laughter, and filler words like "um").

Contextual Alignment: Matching prompts to assigned conversation types and sub-categories (e.g., problem-solving or immersive role-play).

Transcript Accuracy: Verifying that transcripts mirrored the audio exactly, including pauses marked with ellipses (…) and contractions (e.g., "I’m" vs. "I am").