Project Aether
As an AI Model Evaluator, I supported reinforcement learning from human feedback (RLHF) by rating and annotating LLM-generated outputs. The work focused on maintaining output quality and identifying model weaknesses through adversarial prompting. I contributed to continuous improvement by delivering structured feedback aligned with evolving guidelines.
• Evaluated and ranked natural-language outputs for coherence and alignment.
• Designed and executed adversarial prompts to test reasoning and robustness.
• Maintained high standards under shifting project guidelines.
• Supported model development through detailed rating systems and feedback.