AI Content Evaluator & Trainer (Freelance)
I evaluated thousands of AI-generated responses for accuracy, helpfulness, tone, and safety across a range of subject matter, including general knowledge, creative writing, and crypto/Web3. I wrote and curated high-quality prompt-response pairs for large language model (LLM) training datasets, strictly following annotation guidelines and rubrics. I supported reinforcement learning from human feedback (RLHF) pipelines by ranking model outputs, flagging policy violations, and providing detailed rationales for model improvement.
• Maintained consistent quality scores above platform benchmarks.
• Used Outlier AI and DataAnnotation.tech annotation systems for large-scale labeling workflows.
• Focused on nuance, objectivity, and policy adherence in prompt evaluation and content rating.
• Contributed to iterative model improvement through written feedback and rationales.