AI Content Evaluation & RLHF Prompt Rating
Evaluated AI-generated responses across multiple domains for clarity, reasoning depth, factual accuracy, instruction following, and language quality. Performed pairwise comparisons and rankings of model outputs and provided structured feedback to improve them. Tasks included prompt testing, summarization, question answering, and rewriting responses for higher quality. Worked in asynchronous remote workflows, applying detailed guidelines and consistency standards to RLHF-style evaluation.
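For illustration, a minimal sketch of how pairwise preference judgments like those above can be aggregated into a ranking of candidate responses. The response IDs and the `rank_by_win_rate` helper are hypothetical, and production RLHF pipelines typically fit a preference model (such as Bradley-Terry) rather than sorting by raw win rates.

```python
from collections import defaultdict

# Hypothetical pairwise judgments: each tuple is (winner_id, loser_id),
# recorded when one response was rated better than another.
comparisons = [
    ("resp_a", "resp_b"),
    ("resp_a", "resp_c"),
    ("resp_b", "resp_c"),
    ("resp_c", "resp_b"),
]

def rank_by_win_rate(pairs):
    """Aggregate pairwise preferences into a ranking by win rate."""
    wins = defaultdict(int)
    games = defaultdict(int)
    for winner, loser in pairs:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    # Sort responses by the fraction of comparisons won, best first.
    return sorted(games, key=lambda r: wins[r] / games[r], reverse=True)

print(rank_by_win_rate(comparisons))
# e.g. ['resp_a', 'resp_b', 'resp_c']
```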