LLM Trainer
I worked as a large language model (LLM) trainer, participating in the training, development, and evaluation of AI models such as ChatGPT, Gemini, Grok, and Claude. My responsibilities included testing multilingual performance, factual accuracy, and response handling, and evaluating AI responses for quality and completeness. I collaborated with teams to develop evaluation criteria and scoring rubrics, and worked continually to improve the reliability and performance of AI models.
• Evaluated model responses in multiple languages
• Performed detailed performance and factuality checks
• Contributed to rubric and criteria design
• Focused on improving LLM reliability and accuracy