AI Training Specialist (RLHF)
As an AI Training Specialist (RLHF) at Outlier, I evaluated LLM response quality and improved reasoning chains. I performed complex multimodal annotation, validating relationships and creating precise instruction tuning datasets. My work included benchmarking and refining artistic style transfer in model outputs.
• Conducted RLHF evaluations and response selection on large language models
• Labeled video–entity relationships to enhance RAG accuracy
• Authored high-quality instruction tuning data for video and style reference pairs
• Benchmarked image and video outputs using defined quality metrics