AI Training Specialist (RLHF Model Optimization)
As an AI Training Specialist at Outlier, I produced and reviewed backend-focused code samples and tests used to train and evaluate LLM coding behavior via RLHF. I improved the correctness of generated code by applying engineering best practices, including input validation, error handling, and edge-case coverage. I used rubric-based reviews to ensure high-quality, diverse code generation for model improvement.
• Generated and curated code samples in Python and TypeScript
• Applied comprehensive review standards for LLM evaluation
• Ensured code reliability and accuracy for RLHF workflows
• Focused on improving the quality of code generation in LLMs