AI Data Quality Analyst – DataLab Africa Ltd.
As an AI Data Quality Analyst, I designed annotation schemas for multilingual NLP datasets covering sentiment analysis, named entity recognition, and dialogue act classification. I established quality-control pipelines that combined inter-annotator agreement metrics with automated preprocessing to improve workflow efficiency. I also created and maintained comprehensive style guides, ran calibration sessions, and collaborated on RLHF preference annotation guidelines.
• Built ground-truth datasets for Swahili, English, and Kikuyu fine-tuning.
• Automated labeling workflows with spaCy and Python tools.
• Annotated 15,000 RLHF response pairs using preference rubrics.
• Led quality assurance and ontology consistency checks across 200,000 samples.