Achilles Evaluation
Data Annotation Tech | Text | RLHF
Evaluating two AI model responses to previously generated prompts across a range of criteria, including overall quality, factuality, writing quality, verbosity, harmfulness, and collaboration.
2023 - 2023