LLM Post-Training Intern (Remote)
As an LLM Post-Training Intern at Ethara.AI, I performed reinforcement learning from human feedback (RLHF) evaluation on text-to-image outputs. My work involved assessing how well generated images adhered to provided instructions and identifying visual artifacts. I conducted pairwise comparisons and rankings to optimize AI model responses and wrote detailed justifications for my decisions.
• Evaluated text-to-image generations for instruction adherence and quality
• Conducted RLHF-based pairwise comparison and image ranking
• Identified artifacts and inconsistencies in AI-generated images
• Produced structured, reasoned explanations for preference decisions