AI Code Evaluator & Prompt Engineer (Freelance / Contract)
In this freelance contract role, I conducted deep technical evaluations of large language models, particularly Claude. I crafted complex, multi-file coding prompts to probe AI reasoning and analyzed AI-generated code to uncover hallucinations and logic errors. I oversaw adversarial testing and containerized evaluation workflows to keep training data and feedback for RLHF pipelines reproducible and secure.

• Developed and administered coding prompts focused on context retention, reasoning, and model robustness.
• Documented failure cases, edge behaviors, and security implications to guide model improvement.
• Used Docker and version control to evaluate AI-generated code in isolated, reproducible environments (sketched below).
• Supported AI training processes with structured feedback and expert code analysis.
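A minimal sketch of the kind of containerized, isolated evaluation described above; the base image, resource limits, entrypoint name, and paths are illustrative assumptions, not the exact setup used:

    # Illustrative only: run a model-generated script in a disposable,
    # network-less Docker container so the evaluation stays isolated.
    import subprocess
    from pathlib import Path

    def evaluate_in_container(workdir: Path, entrypoint: str = "solution.py"):
        """Execute AI-generated code under strict resource and network limits."""
        cmd = [
            "docker", "run", "--rm",
            "--network", "none",                    # no outbound access for untrusted code
            "--memory", "512m", "--cpus", "1",      # cap memory and CPU
            "-v", f"{workdir.resolve()}:/work:ro",  # mount the code read-only
            "-w", "/work",
            "python:3.11-slim",                     # assumed base image
            "python", entrypoint,
        ]
        # Capture stdout/stderr so failure cases can be documented afterwards.
        return subprocess.run(cmd, capture_output=True, text=True, timeout=120)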