Content Reviewer & AI Model Evaluator (Contract/Remote)
Reviewed and rated AI-generated text responses for clarity, factuality, and adherence to instructions. Annotated and labeled structured text datasets to support supervised fine-tuning and calibration of large language models. Designed, iterated, and tested prompts to evaluate reasoning depth and edge-case performance.
• Evaluated outputs from ChatGPT, Claude, Gemini, and Midjourney.
• Identified hallucinations, bias, ambiguity, and structural inconsistencies in AI outputs.
• Delivered standardized Markdown-formatted evaluation reports to align review criteria.
• Worked remotely as a contractor, maintaining consistent and accurate annotation practices.