Apollo Custom Multi-Challenge (090)
In this project, you were given a multi-turn, text-based conversation between a user and an AI assistant. If the conversation met the criteria for assessment, you identified the type of failure the assistant made. You then wrote three high-quality questions, each meeting specific criteria, to ask the assistant in order to test whether it had failed, and provided both the answer the assistant would actually give and the answer it should give if correct.