AI Model Contributor (LLM Evaluator)
I evaluated and improved code and documentation generated by large language models, using structured prompts against real-world GitHub repositories. I compared outputs from multiple models and provided detailed, production-focused feedback to improve the accuracy and reliability of AI-generated technical content.
• Designed and executed structured prompt experiments for LLM evaluation
• Rated generated code and documentation and provided detailed feedback
• Performed side-by-side comparisons of outputs from multiple LLMs
• Refined evaluation workflows to improve reliability and usability for enterprise use