RAG System Data Quality and Retrieval Evaluation
I conducted data annotation and quality evaluation for RAG (Retrieval-Augmented Generation) systems. My responsibilities included document chunk labeling, semantic similarity annotation, and relevance scoring to optimize vector database retrieval. I curated and benchmarked training datasets for factual accuracy, context alignment, and hallucination reduction.
• Labeled document chunks for semantic similarity, topic relevance, and retrieval quality
• Evaluated generated answers for groundedness in the retrieved context and improved RAG system performance benchmarks
• Used ChromaDB, Pinecone, and Weaviate for data handling and evaluation tasks
• Delivered measurable improvements in vector database retrieval accuracy and relevance
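As a rough illustration of the retrieval-quality evaluation described above, the sketch below scores ranked retrieval results against human relevance labels using two standard metrics, recall@k and mean reciprocal rank (MRR). The query and chunk IDs are hypothetical placeholders, not data from the actual project.

```python
# Sketch: scoring vector-DB retrieval against annotator relevance labels.
# Metric definitions are standard; the sample data below is hypothetical.

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of labeled-relevant chunks appearing in the top-k results."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids) if relevant_ids else 0.0

def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant chunk, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

# Hypothetical annotations: query ID -> chunk IDs judged relevant by annotators
labels = {"q1": {"c2", "c5"}}
# Hypothetical vector-DB output: ranked chunk IDs returned for each query
retrieved = {"q1": ["c7", "c2", "c9", "c5"]}

for query, ranked in retrieved.items():
    r3 = recall_at_k(ranked, labels[query], k=3)
    rr = reciprocal_rank(ranked, labels[query])
    print(f"{query}: recall@3={r3:.2f}, RR={rr:.2f}")
```

Averaging reciprocal rank over all queries gives MRR; tracking these numbers before and after annotation rounds is one common way to demonstrate measurable retrieval improvements.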