Data Scientist Specialist (LLM Evaluation/AI Training, Verizon)
As a Data Scientist Specialist at Verizon, I designed and implemented LLM evaluation pipelines for grounding checks, relevance scoring, and observability of AI models. I developed and tested real-time retrieval-augmented generation (RAG) chatbots and optimized embedding and retrieval workflows to minimize hallucinations. My work involved iterative prompt engineering, creating multi-step conversational flows, and building annotation tools for model output evaluation.
• Led the development of semantic chunking and vector search optimizations to improve LLM output accuracy.
• Used LangChain, LangGraph, and Pinecone to structure RAG data and label user interactions with high contextual fidelity.
• Built and deployed RAG evaluation pipelines using RAGAS and LangSmith for systematic model performance feedback.
• Drove token usage tracking, prompt/response tracing, and prompt evaluation as part of LLMOps best practices.