LLM Response Evaluation & Training (Project)
Evaluated and improved the accuracy of LLM responses by designing complex prompts, and audited AI-generated content for hallucinations, logical consistency, and subject-matter accuracy in support of the supervised fine-tuning process for generative AI systems.
• Created and rated prompt-response pairs for LLM testing.
• Annotated datasets to support alignment and logical evaluation.
• Analyzed AI content for truthfulness, bias, and fluency.
• Collaborated with other annotators to standardize rating guidelines.