French specialist
The Argon project consisted of conversing with the LLM through prompts that varied by category, then rating which of the two models gave the better response across several criteria (localization, truthfulness, harmlessness, instruction following, etc.).