AI Data Training and RLHF Data Annotator
I supported the fine-tuning of large language model (LLM) responses through reinforcement learning from human feedback (RLHF) and high-precision data annotation. My work helped improve AI-generated outputs and align them more closely with user expectations. The role involved reviewing, annotating, and evaluating AI responses to prompts across diverse topics.
• Conducted RLHF-based evaluation of LLM-generated text outputs
• Annotated and rated AI responses for accuracy, relevance, and appropriateness
• Used internal, proprietary annotation tools to complete feedback tasks
• Provided critical insights to improve LLM behavior and safety