AI Engineer – Data Preparation and LLM-Driven Document Transformation
Engineered an end-to-end RAG pipeline that automates the transformation of policy documents into a unified format suitable for LLM-driven analysis. Integrated OpenAI text-embedding-ada-002 embeddings, LangChain chunking, and a Pinecone vector store to power semantic search and requirement matching. Enabled automated policy compliance gap analysis and large-scale LLM-based document rewriting for high-coverage audit documentation.
• Designed data transformation processes for over 100 JSON/PDF documents across compliance domains.
• Developed pipelines for automated chunking and semantic embedding of documents.
• Leveraged LLMs for text-based gap analysis and compliant document generation.
• Conducted model evaluation to verify high semantic retrieval accuracy.