LLM model evaluation
The project consisted of evaluating the responses that two different models generated from an identical input. The prompt category varied each time, and responses were examined along several axes of evaluation, such as Localization/Fluency, Accuracy/Appropriateness, and the factual veracity of any data the model provided.
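As a rough illustration only, one side-by-side judgment of this kind could be recorded as shown below. The three axis names come from the description above. Everything else is an assumption made for the sketch: the 1-5 scale, the `PairwiseEvaluation` name, and especially the mean-based `preferred` rule, since the project description does not say how per-axis ratings were combined.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Axis names follow the project description; the 1-5 scale is an assumption.
AXES = ("localization_fluency", "accuracy_appropriateness", "factual_veracity")


@dataclass
class PairwiseEvaluation:
    """One prompt, two model responses, one score per axis for each model."""
    prompt: str
    category: str
    # None marks an axis that does not apply, e.g. factual veracity
    # when the response contains no factual data.
    scores_a: Dict[str, Optional[int]]
    scores_b: Dict[str, Optional[int]]

    @staticmethod
    def _mean(scores: Dict[str, Optional[int]]) -> float:
        # Average only over the axes that were actually rated.
        rated = [v for v in scores.values() if v is not None]
        return sum(rated) / len(rated) if rated else 0.0

    def preferred(self) -> str:
        """Hypothetical aggregation: the higher mean score wins."""
        a, b = self._mean(self.scores_a), self._mean(self.scores_b)
        if a == b:
            return "tie"
        return "A" if a > b else "B"


# Example record: neither response contains factual data to verify.
example = PairwiseEvaluation(
    prompt="Write a short welcome message for new users.",
    category="creative writing",
    scores_a={"localization_fluency": 4, "accuracy_appropriateness": 5, "factual_veracity": None},
    scores_b={"localization_fluency": 3, "accuracy_appropriateness": 4, "factual_veracity": None},
)
print(example.preferred())
```

Keeping the factual axis optional mirrors the "if any" qualifier above: a response with no factual claims is simply not rated on that axis rather than being penalized.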