AI Trainer
- Evaluated and ranked 500+ LLM-generated responses across reasoning, summarization, and knowledge-based tasks
- Applied structured scoring rubrics assessing coherence, factuality, alignment, and instruction adherence
- Identified hallucinations and categorized failure modes (fabrication, logical gaps, unsupported claims)
- Maintained 95%+ task acceptance rate under QA audits
- Delivered written rationales to support comparative ranking decisions
- Contributed to refinement of training datasets used to improve model performance
- Completed time-sensitive batches while meeting quality benchmarks and consistency standards