LLM Alignment & RLHF-Based Code Evaluation Project
Worked on AI alignment and RLHF-based evaluation workflows for code-generation agents.

Responsibilities:
- Reviewed and compared agent-generated code outputs for correctness, reasoning depth, and instruction alignment
- Defined gold-standard behavioral expectations for multi-step execution workflows
- Performed structured quality scoring and human preference ranking (a sketch of one such record follows this section)
- Analyzed failure modes, execution logs, and reasoning traces
- Iterated on prompts and evaluation instructions to improve robustness and reduce ambiguity

Project Scope:
- Evaluated complex code-generation tasks across varied technical domains
- Applied structured criteria using JSON-based evaluation formats
- Conducted consistency checks to maintain scoring reliability (see the agreement-metric sketch below)

Quality Measures:
- Multi-pass review system
- Cross-checking of reasoning chains against task constraints
- Bias and logical-coherence validation

Focused on improving model alignment, reasoning reliability, and instruction-following behavior.
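As a concrete illustration of the JSON-based evaluation format mentioned above, here is a minimal Python sketch of what one scoring-plus-preference record might look like. All field names (task_id, criteria, preference, and so on) and the 1-5 rubric scale are assumptions for illustration, not the project's actual schema.

    import json

    # Hypothetical shape of one evaluation record; every field name and
    # the scoring scale are illustrative assumptions, not the real schema.
    record = {
        "task_id": "example-001",
        "responses": ["response_a", "response_b"],  # two agent outputs under comparison
        "criteria": {                               # rubric scores on an assumed 1-5 scale
            "correctness": 4,
            "reasoning_depth": 3,
            "instruction_alignment": 5,
        },
        "preference": "response_a",                 # human preference between the two outputs
        "notes": "response_b skips a required validation step",
    }

    print(json.dumps(record, indent=2))

Keeping scores, preference, and free-text notes in one structured record is what makes downstream consistency checks and aggregation straightforward.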
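The consistency checks mentioned above imply some measure of agreement across review passes; one standard statistic for that is Cohen's kappa. This pure-Python sketch assumes that choice, since the description does not name the statistic actually used:

    from collections import Counter

    def cohens_kappa(ratings_a, ratings_b):
        """Cohen's kappa between two label sequences (pure-Python sketch)."""
        n = len(ratings_a)
        # Observed agreement: fraction of items both passes labeled the same.
        observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
        # Expected chance agreement, from each pass's label frequencies.
        counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
        expected = sum(counts_a[lab] * counts_b[lab] for lab in counts_a) / (n * n)
        # Undefined when expected == 1 (both passes constant); fine for a sketch.
        return (observed - expected) / (1 - expected)

    # Example: verdicts from two independent review passes over the same six tasks.
    pass_1 = ["pass", "pass", "fail", "pass", "fail", "fail"]
    pass_2 = ["pass", "fail", "fail", "pass", "fail", "pass"]
    print(f"kappa = {cohens_kappa(pass_1, pass_2):.2f}")  # prints kappa = 0.33

A kappa near 1 indicates that repeated scoring passes agree well beyond chance; a low value flags rubric ambiguity worth iterating on.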