Multilingual NLP Dataset Preparation for Neural Machine Translation
Prepared and structured more than 15,000 Hausa–English sentence pairs for training a neural machine translation (NMT) model. Cleaned the dataset, tokenized the text, aligned source and target sequences, and preprocessed the pairs for supervised learning workflows. Evaluated translation quality with BLEU and reviewed model outputs to improve contextual accuracy. The work combined annotation-style validation of the dataset, multilingual text alignment, and iterative quality improvement for AI training pipelines.
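
The cleaning step for a parallel corpus like this one might look roughly like the sketch below. It is illustrative only, not the pipeline actually used: the function names, the `max_ratio` threshold, and the sample sentence pairs are all assumptions. It normalizes Unicode to NFC (useful for Hausa letters such as ɓ, ɗ, and ƙ, which can arrive in inconsistent encodings), collapses whitespace, drops empty or badly length-mismatched pairs, and removes duplicates.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """NFC-normalize the text and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def clean_pairs(pairs, max_ratio=3.0):
    """Return normalized, de-duplicated (source, target) pairs.

    max_ratio is a hypothetical cap on the whitespace-token length
    ratio between the two sides; pairs above it are treated as
    likely misalignments and dropped.
    """
    seen = set()
    cleaned = []
    for src, tgt in pairs:
        src, tgt = normalize(src), normalize(tgt)
        if not src or not tgt:
            continue  # one side is empty after normalization
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue  # lengths too different to be a plausible pair
        key = (src.casefold(), tgt.casefold())
        if key in seen:
            continue  # case-insensitive duplicate
        seen.add(key)
        cleaned.append((src, tgt))
    return cleaned


# Made-up sample data for illustration only.
raw = [
    ("Ina  kwana ", "Good morning"),
    ("Ina kwana", "good morning"),  # duplicate after normalization
    ("Na gode", "Thank you"),
    ("", "Missing source"),
    ("Sannu", "Hello and welcome to a much longer unrelated sentence"),
]
cleaned = clean_pairs(raw)
```

A token-ratio filter is a blunt heuristic, so in practice the threshold would need tuning against a manually reviewed sample before being applied to the full set of pairs.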