Senior LLM Engineer (Agent Systems & Evaluation), Moniepoint
Designed high-quality multi-turn conversational datasets that simulate user interactions with AI assistants. Evaluated LLM outputs for reasoning, coherence, and correctness to identify performance gaps. Iteratively refined the datasets based on feedback and observed model failures to improve alignment and robustness.
• Created datasets for AI assistant agent behaviors, including function-calling workflows.
• Modeled structured tool interactions using JSON schemas for API alignment.
• Generated edge-case and ambiguous scenarios to test system limits.
• Improved the clarity and realism of the data, ensuring guideline adherence and logical sequencing.