AI-Generated Text Evaluation via RL Training Environment
I built and launched an RL-based evaluation platform in which AI agents assess and rate AI-generated text against predefined rubrics. The work involved designing reward systems, automating assessment criteria, and benchmarking agent outputs reproducibly, with structured feedback cycles driven by the rubric metrics.
• Created custom rubrics for scoring and evaluating AI-generated text.
• Integrated sentence-transformer similarity metrics to automate rubric assessments (sketched after this list).
• Ran RL training loops with continuous evaluation of agent improvement (see the reward sketch below).
• Deployed the evaluation pipeline as a live API for real-time testing and session tracking (see the endpoint sketch below).
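A minimal sketch of the similarity-based rubric scoring, assuming the sentence-transformers library; the model name and the rubric criteria are illustrative stand-ins, not the production values.

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative rubric: each criterion maps to a reference statement the
# generated text should semantically satisfy (assumed, not the real rubric).
RUBRIC = {
    "clarity": "The text states its main point in plain, direct language.",
    "evidence": "Claims are backed by concrete examples or data.",
    "structure": "Ideas follow a logical order with clear transitions.",
}

# Model choice is an assumption; any sentence-transformers checkpoint works.
model = SentenceTransformer("all-MiniLM-L6-v2")

def score_against_rubric(text: str) -> dict[str, float]:
    """Cosine similarity between the text and each rubric criterion."""
    text_emb = model.encode(text, convert_to_tensor=True)
    ref_embs = model.encode(list(RUBRIC.values()), convert_to_tensor=True)
    sims = util.cos_sim(text_emb, ref_embs)[0]  # shape: (num_criteria,)
    return {criterion: float(s) for criterion, s in zip(RUBRIC, sims)}
```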
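And a sketch of how per-criterion scores could collapse into the scalar reward driving the RL training loop, building on score_against_rubric above; the weights are an assumed design, not the deployed reward shaping.

```python
# Assumed criterion weights; the real reward design is not reproduced here.
WEIGHTS = {"clarity": 0.4, "evidence": 0.4, "structure": 0.2}

def rubric_reward(scores: dict[str, float]) -> float:
    """Weighted sum of rubric similarities, usable as a per-episode reward."""
    return sum(WEIGHTS[criterion] * value for criterion, value in scores.items())
```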
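Finally, a sketch of the live evaluation endpoint with session tracking; FastAPI, the route name, and the in-memory session store are all assumptions made for illustration.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
SESSIONS: dict[str, list[dict]] = {}  # hypothetical in-memory session store

class EvalRequest(BaseModel):
    session_id: str
    text: str

@app.post("/evaluate")  # assumed route; the real path isn't documented above
def evaluate(req: EvalRequest) -> dict:
    """Score the submitted text, log it under its session, return the result."""
    scores = score_against_rubric(req.text)  # from the scoring sketch above
    result = {"scores": scores, "reward": rubric_reward(scores)}
    SESSIONS.setdefault(req.session_id, []).append(result)
    return result
```

Served with, for example, `uvicorn service:app`, this keeps a per-session score history for real-time testing.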