Technical AI Trainer & RLHF Specialist
As a Technical AI Trainer and RLHF Specialist, I performed adversarial testing on advanced large language models to identify security vulnerabilities and logical errors. My work included conducting Supervised Fine-Tuning (SFT) with expert-level code and problem solutions and auditing AI-generated code for hallucinations and incorrect logic. I created high-level Chain-of-Thought (CoT) responses for mathematical, medical, and legal AI training workflows.
• Exposed vulnerabilities in reasoning and logic in LLMs via red teaming and adversarial prompt creation.
• Drafted reference solutions in Java and C++ for complex data structures and algorithm tasks.
• Identified non-idiomatic or faulty code patterns in model outputs to improve generation accuracy.
• Generated CoT reasoning data to guide AI on solving multi-step, high-complexity queries.