Multilingual LLM Evaluation and Text Classification for African Language Models
This project involved labeling and evaluating a multilingual dataset designed to improve large language model (LLM) performance in low-resource African languages, specifically Kiswahili and Kalenjin. Tasks included sentiment tagging, named entity recognition, prompt-response evaluation, translation quality checks, and rating AI-generated text for accuracy, fluency, and cultural appropriateness. I also helped refine prompt-engineering strategies for training and testing generative models across diverse linguistic contexts. The project covered over 15,000 labeled text samples and followed strict quality control measures, including double-pass reviews and inter-annotator agreement checks.
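As context for the quality control step, the sketch below shows one common way an inter-annotator agreement check can be computed: Cohen's kappa over two annotators' labels for the same samples. It assumes Python with scikit-learn, and the label values and annotator arrays are hypothetical illustrations, not the project's actual data or tooling.

```python
# A minimal sketch of an inter-annotator agreement check (hypothetical data).
from sklearn.metrics import cohen_kappa_score

# Sentiment labels assigned independently by two annotators to the same
# ten samples, as in a double-pass review.
annotator_a = ["pos", "neg", "neu", "pos", "pos", "neg", "neu", "neu", "pos", "neg"]
annotator_b = ["pos", "neg", "neu", "pos", "neg", "neg", "neu", "pos", "pos", "neg"]

# Cohen's kappa corrects raw agreement for the agreement expected by chance;
# values near 1.0 indicate strong agreement between annotators.
kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

In practice, agreement scores like this are typically tracked per task (e.g., sentiment vs. named entity labels) so that low-agreement categories can be flagged for guideline revision or re-annotation.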