Image Annotation and Data Labeling Contributor
I conducted Reinforcement Learning from Human Feedback (RLHF) to improve language model performance, focusing on STEM and software-logic subject matter. Duties included evaluating AI system responses for mathematical and structural accuracy, ranking outputs, and flagging safety and logic errors. Step-by-step reasoning verification and prompt consistency audits were also key parts of my workflow.

• Used uTest and Testlio as annotation and evaluation platforms.
• Assessed logical flow and correctness for technical and academic prompts.
• Specialized in STEM-focused prompt engineering and response ranking.
• Produced detailed documentation of errors and edge-case handling.