AI Content Trainer / Data Evaluator
Evaluated and ranked AI-generated responses for a Large Language Model (LLM) project. Compared multiple model outputs and rated them on helpfulness, honesty, and safety as part of reinforcement learning from human feedback (RLHF). Wrote original "demonstration data" to teach the model how to answer complex user prompts correctly. Maintained a 98% accuracy score on weekly quality audits.