Freelance Copywriter & Model Evaluator
In this role, I evaluated and ranked AI-generated responses for accuracy, reasoning, clarity, and instruction adherence. I also wrote high-quality prompts and responses for large language model fine-tuning datasets. I systematically identified hallucinations and inconsistencies and assessed bias, safety, and compliance risks.
• Evaluated 5,000+ AI responses using structured rubrics
• Authored 3,000+ prompts for supervised fine-tuning (SFT)
• Flagged model weaknesses and safety issues
• Maintained high throughput and accuracy benchmarks