Datamundi
* Checked model answers by breaking them into individual claims and verifying each claim with web sources
* Wrote supervised training answers from evidence sets and fixed incorrect or incomplete sample answers
* Evaluated retrieval and summarization quality by comparing system outputs with gold references
* Selected and reviewed supporting paragraphs and sources for relevance, quality, and trustworthiness
* Identified recurring failure patterns and provided structured feedback to improve model behavior