Language Training for AI Models
This project focuses on evaluating the quality of responses generated by two different AI models. I assess how effectively each response addresses its prompt, considering factors such as cultural appropriateness (localization), factual accuracy, adherence to specific instructions, and the absence of offensive or misleading content (harmlessness). Each response is scored on a scale across these criteria, with justifications provided to ensure transparency in the evaluation process.

The project also adheres to a specific average handling time per task, and the overall quality of work is measured by the accuracy of the ratings, the validity of the justifications, and the ability to meet the established timeframes. The project further encompasses tasks such as creating prompts and responses based on defined criteria and verifying the factuality of responses using designated tools. I also evaluate the quality of the voice mode employed by each model.
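The per-criterion scoring with written justifications could be sketched as a simple record like the one below. The criterion names, the 1-to-5 scale, and the class and field names are illustrative assumptions, not the project's actual rubric or schema.

```python
from dataclasses import dataclass

# Assumed criteria drawn from the description above; the real rubric may differ.
CRITERIA = ("localization", "factual_accuracy",
            "instruction_following", "harmlessness")

@dataclass
class ResponseEvaluation:
    model_id: str          # which of the two models produced the response
    scores: dict           # criterion name -> score on an assumed 1-5 scale
    justification: str     # written rationale, kept for transparency

    def overall(self) -> float:
        """Unweighted mean across all criterion scores."""
        return sum(self.scores.values()) / len(self.scores)

# Example evaluation of one response (values are made up for illustration).
eval_a = ResponseEvaluation(
    model_id="model_a",
    scores={"localization": 4, "factual_accuracy": 5,
            "instruction_following": 4, "harmlessness": 5},
    justification="Accurate and culturally appropriate; one minor instruction slip.",
)
print(eval_a.overall())  # → 4.5
```

Keeping the justification alongside the numeric scores means a reviewer can audit why a rating was given, which mirrors the transparency goal described above.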