AI Data Annotator, Coding and Model Evaluation
At Handshake AI, I evaluated AI-generated coding solutions for accuracy, quality, and reasoning, and provided detailed feedback to improve the performance of large language models on programming tasks. I also wrote prompts and challenging test scenarios to strengthen the models' code generation capabilities.
• Evaluated model outputs across Python, JavaScript, and other programming languages.
• Wrote edge-case test scenarios to probe model reasoning.
• Provided structured feedback on code accuracy and logic.
• Supported improvements in LLM benchmark performance.