LLM Response Evaluation Project Contributor
Contributed to a Large Language Model (LLM) Response Evaluation Project, participating in reinforcement learning workflows. Responsibilities included ranking model outputs, flagging errors, and submitting structured reviews to improve model performance, following comprehensive project guidelines to ensure reliable results.
• Compared and ranked AI-generated responses across quality metrics.
• Detected and reported hallucinations, inconsistencies, and reasoning errors.
• Provided detailed written justifications and suggestions for each evaluation.
• Ensured adherence to project-specific annotation and evaluation procedures.