AI Text & Audio Labeling Expert in French/Arabic with Multilingual Evaluation
The project focuses on refining natural language models by evaluating, labeling, and ranking AI-generated responses. Most of the work consists of evaluating AI response quality by judging a prompt and two generated responses on several quality dimensions. A project can take up to a week of work, made up of tasks that each take 30 minutes to 1 hour on average to complete. The project adheres to strict quality-control guidelines.