AI Training & Evaluation for Code-Centric Large Language Models
•Performed high-precision data labeling and annotation for code generation, code analysis, and structured problem-solving tasks in Python and query-based workflows.
•Designed and refined complex prompts to evaluate model performance on algorithmic reasoning, debugging, and multi-step coding tasks.
•Reviewed and corrected AI-generated code to ensure syntactic correctness, logical soundness, and adherence to software-development best practices.
•Annotated model responses with detailed feedback on reasoning errors, inefficiencies, and optimization opportunities, supporting supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).
•Evaluated AI outputs on machine learning concepts, including data preprocessing logic, feature reasoning, and model-agnostic analytical workflows.
•Curated high-quality prompt–response datasets to support model training, validation, and regression testing.