Prompt–Response Evaluation for LLM Fine-Tuning
I participated in a text-based AI training project on the Outlier platform focused on LLM prompt–response evaluation. My role involved reviewing and rating AI-generated responses on clarity, accuracy, relevance, and overall helpfulness; selecting the preferred completion among candidates; flagging unsafe content; and suggesting concrete improvements when required. The project followed strict quality guidelines, and I consistently met its accuracy and instruction-adherence standards. By supplying this human feedback, the work helped improve the performance and safety of the large language models being fine-tuned. All tasks were completed in Outlier’s custom web-based labeling interface.
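To make the workflow concrete, below is a minimal Python sketch of what one such evaluation record might look like. Outlier’s labeling interface and task schema are proprietary, so every field name and the 1–5 rating scale here are assumptions chosen purely to illustrate the criteria and steps described above.

```python
from dataclasses import dataclass

# Assumed 1 (poor) to 5 (excellent) scale; the real project's scale may differ.
RATING_SCALE = range(1, 6)

@dataclass
class ResponseRating:
    """Per-completion scores on the four evaluation criteria."""
    clarity: int
    accuracy: int
    relevance: int
    helpfulness: int

    def __post_init__(self) -> None:
        # Reject scores outside the assumed scale.
        for name, score in vars(self).items():
            if score not in RATING_SCALE:
                raise ValueError(f"{name} must be in {list(RATING_SCALE)}")

@dataclass
class EvaluationTask:
    """One prompt with candidate completions and the rater's judgments."""
    prompt: str
    completions: list[str]
    ratings: list[ResponseRating]   # one rating per completion
    preferred_index: int            # which completion the rater preferred
    unsafe_flags: list[bool]        # one safety flag per completion
    improvement_notes: str = ""     # free-text suggestions, when required

# Example: rating two candidate completions and selecting the better one.
task = EvaluationTask(
    prompt="Explain what an API is in one sentence.",
    completions=[
        "An API is a set of rules that lets programs talk to each other.",
        "APIs are things computers use.",
    ],
    ratings=[
        ResponseRating(clarity=5, accuracy=5, relevance=5, helpfulness=5),
        ResponseRating(clarity=2, accuracy=3, relevance=4, helpfulness=2),
    ],
    preferred_index=0,
    unsafe_flags=[False, False],
    improvement_notes="Second completion is too vague to be helpful.",
)
```

Records like this pair of rated completions, with one marked as preferred, are the general shape of the human-preference data used when fine-tuning language models from evaluator feedback.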