micro1.ai
Evaluate tasks based primarily on the deliverable files. Identify what each completion does well, focusing on how the deliverable meets the request in the prompt, and point out where each completion falls short. State clearly whether one completion is better than, worse than, or equal to the other. Complete the 3 required completion states in the Simulated Applications Environment and fill in the remaining fields of the evaluation form, such as difficulty, overall score, and realism. This helps assess whether the grader, prompt, and environment are functioning correctly.