AI Alignment & Technical Subject Matter Expert (Safaricom AI Tribe)
I optimized high-dimensional conversational datasets for M-Pesa’s AI-driven interfaces, using RLHF to improve response accuracy. I red-teamed and evaluated frontier LLMs, identifying and correcting logical and factual errors. I also developed 'Gold Standard' datasets focused on chain-of-thought reasoning in fintech queries, with high factual accuracy.
• Labeled and audited conversational data for accuracy, compliance, and hallucination mitigation.
• Applied RLHF and chain-of-thought (CoT) dataset techniques to improve multi-step reasoning in the fintech domain.
• Red-teamed advanced LLMs (GPT-4o, Claude 3.5) to identify model vulnerabilities and data inaccuracies.
• Created detailed technical datasets for benchmarking, error correction, and logic evaluation.