LLM Evaluation and Text Annotation Specialist
Evaluated outputs from large language models for accuracy, grammar, factual correctness, and bias. Annotated multilingual text datasets (English, French, Arabic) for tasks such as classification, sentiment analysis, and named entity recognition (NER). Delivered high-quality, consistent annotations with 98%+ accuracy across thousands of text samples. Collaborated with quality assurance teams to improve prompt design and dataset reliability for training AI models.