LLM Evaluator and AI Data Quality Specialist
Evaluated large language model (LLM) outputs for accuracy, coherence, and compliance with provided guidelines. Applied reinforcement learning from human feedback (RLHF) methodologies with a focus on linguistic quality and prompt engineering, including granular evaluation of syntactic and semantic quality in medical and general contexts.
• Ensured grammatical and factual accuracy across diverse language tasks.
• Reviewed AI-generated responses for logical coherence and guideline adherence.
• Applied domain expertise in biochemistry and the medical sciences to specialist fact-checking.
• Used RLHF methods to evaluate LLM output quality on a proprietary virtual platform.