Independent AI Evaluator & Technical Researcher
This experience involved evaluating and refining outputs from generative AI systems for logical accuracy and prompt engineering specificity. Tasks included identifying and reducing model hallucinations while ensuring that outputs met specified requirements. The work required critical analysis of AI-generated text in technical and business contexts.
• Performed RLHF-based evaluations on LLM responses to user prompts.
• Focused on advanced mathematics, physics, and logical problem-solving texts.
• Used Make.com and custom AI agents to structure and analyze workflow outputs.
• Designed prompts for business, sales, and STEM question-answering use cases.