AI Training & Evaluation Contributor (Contract)
As an AI Training & Evaluation Contributor at Xelron AI, I designed and assessed challenging tasks for large language models as part of benchmarking and evaluation projects. My work involved creating adversarial prompts, conducting detailed response evaluations, and providing structured feedback on advanced model reasoning. I contributed to both natural-language and code-based tasks, supporting improvements in general and technical AI accuracy.

• Developed high-difficulty reasoning benchmarks to test LLM capabilities
• Evaluated model outputs for correctness, depth, and consistency
• Provided structured human feedback to identify strengths and errors in model reasoning
• Compared AI models on performance metrics such as code logic and documentation quality