Image Document Classification
We carried out a document classification project to streamline the document processing pipeline for an advanced Q&A system built on Large Language Models (LLMs).

Scope of the project: The primary goal was binary classification of documents into two categories: "simple" (containing only text) or "complex" (containing any visual element, such as a chart, graph, or table). This classification determines whether a document requires additional processing with a Vision-capable Large Language Model (VLLM) for improved indexing and Q&A performance.

Specific data labeling tasks performed:
Binary document classification: Categorized each document as either simple or complex based on visual inspection.
Visual element presence check: Identified the presence of any non-textual elements that would classify a document as complex.

Project size: We classified approximately 5,000 documents from various domains, including financial reports, scientific papers, and technical manuals.
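
The labeling rule and the routing decision it drives can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used in the project: the labels were assigned by visual inspection, and the names `LabeledDocument`, `classify`, and `needs_vision_model`, along with the element-type strings, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LabeledDocument:
    """Hypothetical record of what an annotator observed in one document."""
    doc_id: str
    # Element types noted during visual inspection, e.g. "text", "chart", "table".
    elements: list[str] = field(default_factory=list)


def classify(doc: LabeledDocument) -> str:
    """Return "complex" if any non-textual element is present, else "simple"."""
    has_visual = any(element != "text" for element in doc.elements)
    return "complex" if has_visual else "simple"


def needs_vision_model(doc: LabeledDocument) -> bool:
    """Only complex documents are routed to the vision-capable model."""
    return classify(doc) == "complex"


# Example documents (made up for illustration).
report = LabeledDocument("fin-report", ["text", "table", "chart"])
memo = LabeledDocument("memo", ["text"])

print(classify(report), needs_vision_model(report))
print(classify(memo), needs_vision_model(memo))
```

A single non-textual element is enough to mark a document as complex, which matches the "any visual elements" rule in the scope description.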