Senior Software Developer / AI Evaluation Contributor
Evaluated AI-generated code for correctness, performance, and reliability using custom-built testing frameworks. Validated LLM-assisted code understanding and model outputs as part of short-term, project-based AI evaluation work in a freelance/contract capacity.
• Assessed AI responses across English and multi-language codebases (Python, C, Rust, Go)
• Used Python, Docker, CI pipelines, and manual review processes
• Collaborated with remote teams to maintain robust evaluation standards
• Supported cross-language migration workflows with behavioral validation