LLM Fine-Tuning/Data Annotation for Legal Compliance AI
I fine-tuned the LLaMA 3.2-8B large language model on a curated dataset of Indian tax law documents to adapt it for Indian Income Tax compliance queries. The process involved preparing, curating, and annotating legal texts and statutory documents to improve the accuracy of the model's legal guidance. My work centered on domain adaptation of instruction-following models, with a focus on retrieval-augmented generation (RAG) and factual precision.
• Curated and annotated domain-specific legal texts for LLM training
• Implemented QLoRA-based fine-tuning pipelines for language model adaptation
• Evaluated and measured improvements in citation accuracy and context retrieval
• Integrated the fine-tuned model with RAG pipelines, using FAISS to retrieve relevant context for each query
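As an illustration of how "citation accuracy" could be measured, here is a minimal sketch. The project description does not state the metric actually used, so everything below is an assumption: the regular expression `SECTION_RE`, the helper names `extract_citations` and `citation_scores`, and the sample answer and gold set are all hypothetical. The idea is to pull statutory section references out of a model answer and compare them with an annotated gold set of sections.

```python
import re

# Hypothetical pattern (an assumption, not the project's actual rule):
# matches references such as "Section 80C" or "section 139(1)".
SECTION_RE = re.compile(
    r"\bsection\s+(\d+[A-Z]*(?:\(\d+[A-Za-z]*\))*)",
    re.IGNORECASE,
)


def extract_citations(text: str) -> set[str]:
    """Return the set of section identifiers cited in `text`, upper-cased."""
    return {match.upper() for match in SECTION_RE.findall(text)}


def citation_scores(answer: str, gold: set[str]) -> dict[str, float]:
    """Compare sections cited in `answer` against the annotated `gold` set.

    precision: share of cited sections that appear in the gold set
    recall:    share of gold sections that the answer cited
    """
    cited = extract_citations(answer)
    correct = cited & gold
    precision = len(correct) / len(cited) if cited else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return {"precision": precision, "recall": recall}


# Illustrative inputs only; not taken from the project's dataset.
answer = (
    "Deductions under Section 80C are capped; "
    "see also section 139(1) for filing deadlines."
)
gold = {"80C", "80D"}

print(citation_scores(answer, gold))
# prints {'precision': 0.5, 'recall': 0.5}
```

Running the same scoring over a fixed question set before and after fine-tuning would give a simple before/after comparison. A real evaluation would likely also need to handle rules, sub-clauses, and circular references, which this pattern does not cover.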