LLM Training & Evaluation – Text Data Labeling Project
Contributed to a project-based AI training initiative focused on improving Large Language Model (LLM) performance. Wrote high-quality prompt–response examples for Supervised Fine-Tuning (SFT); evaluated and ranked multiple AI-generated responses with clear written justifications; and flagged factual errors, hallucinations, bias, and safety issues. Performed RLHF-style preference rating and red-teaming to stress-test edge cases and unsafe scenarios. Followed strict project rubrics, quality benchmarks, and confidentiality guidelines to deliver consistent, reliable annotations at scale.