AI Content Quality Specialist
I contributed to the fine-tuning of a Large Language Model through Reinforcement Learning from Human Feedback (RLHF). My responsibilities included evaluating model-generated responses against complex criteria such as factual accuracy, helpfulness, tone, and safety. I ranked multiple AI outputs, provided detailed reasoning for preferred responses, and identified subtle hallucinations and logical inconsistencies. By processing numerous prompt-response pairs, I directly helped reduce model bias and improve the system's natural-language reasoning capabilities, ensuring its outputs met high human quality standards.