Toloka
The project consists of various tasks including audio classification and generation, response evaluation, transcription, and other tasks designed to train the project's voice assistant.
With over three years of hands-on experience in data labeling and AI training data preparation, I have worked extensively across leading platforms such as Toloka, Yandex, and Appen, contributing to a wide range of high-impact projects. My expertise spans image and audio classification, transcription, and annotation, with a strong focus on Arabic LLM training to enhance natural language understanding for AI models. I have successfully completed dozens of transcription projects using various industry-standard tools, ensuring high accuracy, consistency, and adherence to project guidelines.
Professional Experience
Freelancer – Data Labeling & Transcription Specialist
Platforms: Toloka, Yandex, Remotask, Oneforma, Appen, Hamssa, Dataplus, Speedx, Mihub, Magic Data, TCS, Data Factory
2021 – Present
Notable Projects:
Alibaba (via Appen): Transcribed over 28 hours of Arabic recordings in Egyptian, Saudi, Iraqi, and Levantine dialects. Served as a quality reviewer for the same project.
Appen Linguistic Proofreading: Proofread Arabic text with diacritic placement.
Magic Data Projects: Completed three linguistic proofreading and audio-text alignment projects.
Praat Tool Projects: Completed six Egyptian dialect transcription projects covering talk shows, lectures, TV series, and podcasts.
Mihub: Transcribed customer service calls with high accuracy.
Speedx: Transcribed and annotated customer service conversations and calls.
Toloka & Yandex: Voice assistant training for Yango
The project is to review automated transcriptions on the Magic Data platform. The dialect is Modern Standard Arabic. My task is to check each transcript, correct spelling errors, and write in the hesitations, stuttering, and word repetitions that the automated transcription omits. I also correct segmentation errors and insert the appropriate tags for laughter, noise, coughing, and similar events. I work at an accuracy of 95% or higher.
Transcribed over 28 hours of Arabic recordings in Egyptian, Saudi, Iraqi, and Levantine dialects. Served as a quality reviewer for the same project. The content was diverse and included lectures, talk shows, podcasts, TV shows, and series. The project was large, exceeding 1,000 hours of audio.
The project involved reviewing transcriptions in formal Arabic: correcting spelling errors, timestamping, separating each speaker into their own segment, identifying areas of overlapping speech, and applying the appropriate tags. The data consisted of academic interviews and dialogues.
The project involves transcribing audio files, each ranging from 10 to 40 minutes, using the Praat tool and delivering the final result as a TextGrid file. The data was in the Egyptian dialect and varied, including talk shows, podcasts, series, academic lectures, and medical conferences. The work includes accurate transcription, segmentation, and inserting the appropriate tags for laughter, noise, coughing, and similar events. The project was long and divided into six stages, each consisting of 1,000 hours of audio. I work at an accuracy of 98% or higher. I also worked as a quality controller, correcting the transcribers' files and raising their quality to 99%.
Bachelor of Education, English Language
Audio Transcription Reviewer