Outlier projects
- Create prompts and conversations with AI chatbots that naturally lead the model to a valid failure on the final round (failures such as self-coherence, inference memory, RVE, etc.)
- Recreate a perfect model response that meets all the criteria
- Create target questions/rubrics to compare the overall performance of two different models