Generalist – English & Arabic (AI LLM Evaluation)
Assessed and improved general conversational and written Arabic–English LLM outputs for fluency, tone, and cultural nuance. Authored detailed bilingual prompt–response pairs and provided error feedback to support frontier LLM training for AI labs. Participated in live prompt-evaluation workflows, directly improving model performance and reliability.
• Evaluated written and spoken AI output for linguistic quality.
• Addressed real user queries as part of model refinement.
• Raised quality standards through context-aware feedback.
• Facilitated bilingual model development for improved responses.