Project: Python Code Optimization & Functional Correctness Annotation
Annotated a dataset of 2,000+ Python snippets for an LLM training project. My role involved evaluating model-generated code for logical soundness, PEP 8 compliance, and algorithmic efficiency. Specific tasks included identifying "hallucinated" libraries and non-existent methods; categorizing code by time complexity (e.g., O(n) vs. O(n²)); debugging syntax errors in snippets that use data science libraries (NumPy, pandas); and ranking multiple model outputs by readability and execution speed. Maintained a 97% internal audit score by ensuring that all labels strictly followed the project’s security and performance guidelines.
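
To illustrate the hallucinated-library check mentioned above, here is a minimal sketch of one way such a check could be automated. It is an assumption for illustration only, not the project's actual tooling; the helper name `symbol_exists` is hypothetical.

```python
import importlib
import importlib.util


def symbol_exists(module_name: str, attr_path: str = "") -> bool:
    """Return True if `module_name` is importable and `attr_path`
    (a dotted path such as "OrderedDict.fromkeys") resolves on it.

    Hypothetical helper. Note that it imports the module, which runs
    the module's top-level code, so it should only be pointed at
    trusted, installed packages.
    """
    try:
        # find_spec returns None for a missing top-level module and
        # raises ModuleNotFoundError when a parent package is missing.
        if importlib.util.find_spec(module_name) is None:
            return False
        obj = importlib.import_module(module_name)
    except (ImportError, ValueError):
        return False

    # Walk the dotted attribute path one component at a time.
    for part in filter(None, attr_path.split(".")):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


if __name__ == "__main__":
    print(symbol_exists("math", "sqrt"))        # real function
    print(symbol_exists("math", "fast_sqrt"))   # invented method
    print(symbol_exists("not_a_real_pkg_xyz"))  # invented library
```

A check like this only confirms that a name exists in the installed version of a library; it says nothing about whether the call is used correctly, so it complements manual review rather than replacing it.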