ICA-RAG: Information Completeness Guided Adaptive Retrieval-Augmented Generation for Disease Diagnosis

He, Jia, Jia et al.
Retrieval-Augmented Large Language Models (LLMs), which integrate external knowledge, have shown remarkable performance in medical domains, including clinical diagnosis. However, existing RAG methods often struggle to tailor retrieval strategies to diagnostic difficulty and input sample informativeness. This limitation leads to excessive and often unnecessary retrieval, impairing computational efficiency and increasing the risk of introducing noise that can degrade diagnostic accuracy. To address this, we propose ICA-RAG (Information Completeness Guided Adaptive Retrieval-Augmented Generation), a novel framework for enhancing RAG reliability in disease diagnosis. ICA-RAG utilizes an adaptive control module to assess the necessity of retrieval based on the input's information completeness. By optimizing retrieval and incorporating knowledge filtering, ICA-RAG better aligns retrieval operations with clinical requirements. Experiments on three Chinese electronic medical record datasets demonstrate that ICA-RAG significantly outperforms baseline methods, highlighting its effectiveness in clinical diagnosis.

ICA-RAG: Information Completeness Guided Adaptive Retrieval-Augmented Generation for Disease Diagnosis

Basic Information

  • Paper ID: 2502.14614
  • Title: ICA-RAG: Information Completeness Guided Adaptive Retrieval-Augmented Generation for Disease Diagnosis
  • Authors: Jiawei He, Mingyi Jia, Zhihao Jia, Junwen Duan, Yan Song, Jianxin Wang
  • Classification: cs.CL (Computation and Language)
  • Publication Date: arXiv preprint (latest version October 15, 2025)
  • Paper Link: https://arxiv.org/abs/2502.14614

Abstract

Retrieval-augmented large language models (RAG-LLMs) demonstrate superior performance in the medical domain by integrating external knowledge, particularly for clinical diagnosis. However, existing RAG methods struggle to tailor retrieval strategies based on diagnostic difficulty and information completeness of input samples, resulting in excessive and unnecessary retrieval that compromises computational efficiency and increases the risk of introducing noise, thereby reducing diagnostic accuracy. To address this issue, this paper proposes ICA-RAG (Information Completeness-guided Adaptive Retrieval-Augmented Generation), a novel framework that enhances the reliability of RAG in disease diagnosis. ICA-RAG leverages an adaptive control module to assess retrieval necessity based on input information completeness, optimizing retrieval operations and knowledge filtering to better align retrieval with clinical requirements. Experiments on three Chinese electronic medical record datasets demonstrate that ICA-RAG significantly outperforms baseline methods, highlighting its effectiveness in clinical diagnosis.

Research Background and Motivation

Problem Background

Large language models face two major challenges in medical tasks:

  1. Hallucination Problem: Generation of seemingly plausible but factually incorrect information
  2. Knowledge Update Cost: Resource-intensive maintenance of up-to-date medical knowledge

Limitations of Existing RAG Methods

  1. Lack of Selective Retrieval Logic: Indiscriminately performing retrieval for all queries, increasing computational and temporal costs
  2. Introduction of Low-Quality Retrievals: May degrade rather than enhance performance through irrelevant information
  3. Domain-Specific Characteristics of Medicine: Many common diseases or cases with mild symptoms and clear diagnoses can be accurately diagnosed without retrieval

Insufficiencies of Existing Adaptive RAG

  1. Methods Based on LLM Output Distribution: LLMs tend to be overconfident, generating high-confidence distributions even when lacking relevant knowledge
  2. Methods Based on Classification Models: In the medical domain, input text typically lacks obvious structural patterns, making it difficult for small language models to understand task difficulty

Core Contributions

  1. Proposes ICA-RAG Framework: An adaptive retrieval-augmented disease diagnosis framework that requires no fine-tuning of the backbone LLM
  2. Innovative Data Annotation Method: Designs an annotation strategy based on masking operations, obtaining label information by triggering different LLM responses
  3. Optimizes Retrieval Process: Tailors the retrieval pipeline for complex clinical contexts
  4. Experimental Validation: Conducts extensive experiments on three Chinese EMR datasets, demonstrating framework effectiveness
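The masking-based annotation idea can be illustrated with a minimal Python sketch. It assumes a simplified two-way rule: a sentence is flagged as critical if masking it makes a previously correct diagnosis wrong. The paper's actual labeling rule is not spelled out in this summary, and `annotate_by_masking`, `MASK`, and `toy_diagnose` are hypothetical names, with `toy_diagnose` standing in for a real LLM call.

```python
from typing import Callable, List

MASK = "[MASK]"  # hypothetical placeholder; the paper's masking token is not given here


def annotate_by_masking(
    sentences: List[str],
    gold_diagnosis: str,
    diagnose: Callable[[str], str],
) -> List[bool]:
    """Return one flag per sentence: True if masking it changes the diagnosis
    away from the gold label (simplified, assumed rule)."""
    flags = []
    for i in range(len(sentences)):
        # Replace only the i-th sentence with the mask token.
        masked = sentences[:i] + [MASK] + sentences[i + 1:]
        prediction = diagnose(" ".join(masked))
        flags.append(prediction != gold_diagnosis)
    return flags


def toy_diagnose(text: str) -> str:
    # Stand-in for an LLM: keys on a single phrase in the record.
    return "pneumonia" if "lung consolidation" in text else "unknown"


record = [
    "Patient is 54 years old.",
    "Chest X-ray shows lung consolidation.",
    "No known allergies.",
]
flags = annotate_by_masking(record, "pneumonia", toy_diagnose)
print(flags)  # [False, True, False]
```

Only the second sentence flips the toy model's answer, so only it is flagged. With a real LLM, redundant records can leave the answer unchanged even when a key sentence is masked, which is the labeling noise noted under Limitations.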

Methodology Details

Task Definition

Direct Disease Diagnosis: Given patient information $Q$, represented as a token sequence $x = [x_1, x_2, \dots, x_n]$, the LLM generates the diagnosis directly: $\hat{D} = \text{LLM}(Q, \text{prompt})$

RAG Disease Diagnosis: Relevant knowledge $d$ is retrieved from an external knowledge source $K$ and passed to the LLM together with the input: $\hat{D} = \text{LLM}(Q, d, \text{prompt})$, where $d = \text{Retriever}(K, Q)$
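The two formulations above differ only in whether retrieved knowledge enters the prompt. A minimal Python sketch of that difference follows; `diagnose`, `echo_llm`, `toy_retriever`, and the prompt wording are hypothetical stand-ins, not the paper's components.

```python
from typing import Callable, List, Optional


def diagnose(
    patient_info: str,
    llm: Callable[[str], str],
    retriever: Optional[Callable[[str], List[str]]] = None,
) -> str:
    """Direct diagnosis when no retriever is given, RAG diagnosis otherwise."""
    prompt = "Diagnose the patient described below.\n"
    if retriever is not None:
        documents = retriever(patient_info)  # d = Retriever(K, Q)
        prompt += "Reference knowledge:\n" + "\n".join(documents) + "\n"
    prompt += "Patient information:\n" + patient_info
    return llm(prompt)  # LLM(Q, prompt) or LLM(Q, d, prompt)


# Toy stand-ins so the sketch runs without a model or a knowledge base.
def echo_llm(prompt: str) -> str:
    return f"prompt of {len(prompt.splitlines())} lines"


def toy_retriever(query: str) -> List[str]:
    return ["Pneumonia often presents with fever and productive cough."]


direct = diagnose("Fever and cough for three days.", echo_llm)
augmented = diagnose("Fever and cough for three days.", echo_llm, toy_retriever)
print(direct)     # prompt of 3 lines
print(augmented)  # prompt of 5 lines
```

The extra two prompt lines in the augmented call are the knowledge header and the single retrieved document. Every retrieved document lengthens the prompt in this way, which is the cost that adaptive retrieval tries to avoid for easy cases.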

Adaptive RAG Disease Diagnosis: A control function $F$ assesses the input $Q$ and selects the generation path:

$$\hat{D} = \begin{cases} \text{LLM}(Q, \text{prompt}), & \text{if } F(Q) = \langle\text{Activate}\rangle \\ \text{LLM}(Q, d, \text{prompt}), & \text{otherwise} \end{cases}$$

### Model Architecture

The ICA-RAG framework comprises three main stages:

#### Stage (a): Retrieval Decision Optimization Based on Input Information Completeness

1. **Text Segmentation**: The input $Q$ is divided into text units (sentences by default): $Q = \{s_i\}_{i=1}^n$
2. **Importance Classification**: A classifier is trained to predict the importance of each unit:

   $$l_i = \text{Classifier}(s_i) \quad \forall i \in \{1, 2, ..., n\}$$

   Labels fall into three classes:
   - A: Information critical for diagnostic decision-making
   - B: Information that helps retrieval but does not by itself determine the diagnosis
   - C: Relatively unimportant information
3. **Information Completeness Calculation**:

   $$I_{\text{norm}}(Q) = \frac{1}{\alpha \cdot n} \sum_{i=1}^n (\alpha \cdot I(l_i = A) + \beta \cdot I(l_i = B) + \gamma \cdot I(l_i = C))$$

   Here $I(\cdot)$ is the indicator function, and $\alpha$, $\beta$, $\gamma$ weight classes A, B, and C respectively.

#### Stage (b): Retrieval Based on Document Segmentation and Mapping

1. **Sentence-Level Retrieval**: Each sentence serves as a query to retrieve the top-m relevant text chunks
2. **Document-Level Re-ranking**: Documents are re-ranked by the number of retrieved chunks that belong to each document
3. **Mapping Strategy**: Each retrieved chunk is mapped back to its original document, which supplies the chunk counts used for re-ranking

#### Stage (c): Knowledge Filtering and Diagnostic Generation Based on Prompt Guidance

A differential diagnosis prompt template is used to filter irrelevant documents, simulating the physician's differential diagnosis process.

### Technical Innovations

1. **Information Completeness Assessment**: Transforms complex document understanding into simple sentence-level tasks
2. **Masking Annotation Strategy**: Automatically obtains training labels through sequence masking operations
3. **Chunk-Document Mapping Re-ranking**: Computes the re-ranking solely from retrieval result values, reducing memory overhead
4. **Differential Diagnosis Filtering**: Filters irrelevant information by simulating the clinical differential diagnosis process

## Experimental Setup

### Datasets

- **CMEMR**: Chinese Electronic Medical Record dataset
- **ClinicalBench**: Clinical benchmark dataset
- **CMB-Clin**: Chinese Medical Benchmark Clinical dataset

All datasets are configured as end-to-end diagnostic tasks, with patient information as input and physician diagnostic conclusions as ground truth labels.

### Evaluation Metrics

Disease names are standardized with ICD-10 (International Classification of Diseases) terminology, and set-level Precision, Recall, and F1-score are computed using fuzzy matching (threshold 0.5).

### Baseline Methods

1. **Non-Retrieval Methods**: CoT, SC-CoT, ATP
2. **Standard Retrieval Methods**: RAG2, LongRAG
3. **Adaptive Retrieval Methods**: Adaptive-RAG, DRAGIN, SEAKR

### Implementation Details

- **Backbone Model**: Qwen2.5-7B-Instruct
- **Classifier**: BERT-base-Chinese
- **Retriever**: BM25
- **External Knowledge Base**: CMKD Clinical Medical Knowledge Database

## Experimental Results

### Main Results

| Method | CMEMR F1 (%) | ClinicalBench F1 (%) | CMB-Clin F1 (%) |
|--------|--------------|----------------------|-----------------|
| CoT | 48.82 | 38.46 | 52.14 |
| LongRAG | 49.07 | 39.25 | 51.81 |
| Adaptive-RAG | 49.27 | 38.04 | 53.44 |
| **ICA-RAG** | **50.88** | **40.79** | **53.53** |

Key Findings:

1. ICA-RAG achieves the best or near-best F1 scores across all datasets
2. Compared to LongRAG, F1 improves by 1.81, 1.54, and 1.72 percentage points on CMEMR, ClinicalBench, and CMB-Clin respectively
3. ICA-RAG outperforms the other adaptive RAG methods; relative to Adaptive-RAG, the F1 margins are 1.61, 2.75, and 0.09 percentage points

### Ablation Study

Ablation results on the CMEMR dataset:

| Variant | F1 (%) | Change (points) |
|---------|--------|-----------------|
| ICA-RAG | 50.88 | - |
| w/o Decision | 48.07 | -2.81 |
| w/o Chunk | 49.78 | -1.10 |
| w/o M-rerank | 49.59 | -1.29 |
| w/o Diff | 49.85 | -1.03 |

### Efficiency Analysis

- **Temporal Efficiency**: Significant improvements compared to non-adaptive RAG methods
- **Parameter Efficiency**: The BERT-Base classifier (110M parameters) is more lightweight than Adaptive-RAG's T5-Large (770M parameters)
- **Applicability**: No access to LLM output probability distributions is needed, so the method applies to closed-source models and API deployments

## Related Work

### RAG Applications in Clinical Disease Diagnosis

- Most research employs basic retrieval methods, encoding external knowledge and task queries through embedding models
- Knowledge graphs are also widely adopted
- Optimization tailored to medical domain characteristics is lacking

### Adaptive RAG

- **FLARE and DRAGIN**: Activate search when the LLM generates low-confidence tokens
- **Self-RAG**: Trains models to dynamically retrieve, critique, and generate text
- **Adaptive-RAG**: Assesses query complexity to determine retrieval necessity
- Existing methods primarily target question-answering tasks and are difficult to transfer directly to medical diagnosis

## Conclusions and Discussion

### Main Conclusions

ICA-RAG effectively addresses the rigid retrieval strategy problem in traditional retrieval-augmented methods by optimizing adaptive retrieval decisions based on input information completeness, demonstrating strong adaptability in complex clinical scenarios.

### Limitations

1. **Annotation Strategy Constraints**: Due to potential redundancy in patient information, LLMs may still reach correct diagnoses after key sentences are masked, leading to inaccurate annotation labels
2. **Complexity of Medical Text**: Clinical medical text contains abbreviations, synonyms, and aliases, with significant variations in documentation across different physicians, affecting retrieval accuracy
3. **Manual Verification Requirements**: Automatic annotation strategies still require human inspection and correction

### Future Directions

1. Explore more effective medical text preprocessing strategies to enhance retrieval quality
2. Apply ICA-RAG to other medical tasks
3. Further optimize the retrieval process

## In-Depth Evaluation

### Strengths

1. **Strong Innovation**: First to propose an adaptive retrieval decision mechanism based on information completeness
2. **High Practicality**: Requires no fine-tuning of the backbone LLM, with strong applicability
3. **Comprehensive Experiments**: Thorough evaluation and ablation studies across multiple datasets
4. **Efficiency Improvements**: Significantly enhances computational efficiency while maintaining performance

### Weaknesses

1. **Dataset Limitations**: Validation only on Chinese EMR datasets, lacking cross-lingual and cross-domain verification
2. **Annotation Quality**: The automatic annotation strategy contains noise and requires human intervention
3. **Threshold Setting**: Lacks theoretical guidance for setting the information completeness thresholds θ₁ and θ₂
4. **Knowledge Base Dependency**: Performance heavily depends on external knowledge base quality

### Impact

1. **Academic Contribution**: Provides new insights for RAG applications in medical AI
2. **Practical Value**: Can be directly applied to clinical decision support systems
3. **Reproducibility**: Detailed method description and clear experimental setup

### Applicable Scenarios

1. **Clinical Diagnosis**: Particularly suitable for cases with complex symptoms requiring differential diagnosis
2. **Medical Question-Answering Systems**: Can enhance accuracy and efficiency of medical consultation systems
3. **Medical Education**: Can serve as an auxiliary tool for medical student learning

## References

The paper cites 41 relevant references covering important works in large language models, retrieval-augmented generation, medical AI, and other domains, providing a solid theoretical foundation for the research.

---

**Overall Assessment**: This is a high-quality paper with significant contributions to the medical AI field. The authors address limitations of existing RAG methods in medical diagnosis and propose an innovative solution, validated through comprehensive experiments. Despite certain limitations, its innovation and practicality make it an important advance in the field.
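
As a worked illustration of the Stage (a) information completeness score $I_{\text{norm}}(Q)$, the following Python sketch computes it from a list of sentence labels. The weight values chosen for $\alpha$, $\beta$, and $\gamma$ are illustrative assumptions (the values used in the paper are not given in this summary), and the function name `information_completeness` is hypothetical.

```python
from typing import Sequence


def information_completeness(
    labels: Sequence[str],
    alpha: float = 1.0,  # assumed weight for class A (critical)
    beta: float = 0.5,   # assumed weight for class B (helps retrieval)
    gamma: float = 0.0,  # assumed weight for class C (unimportant)
) -> float:
    """Compute I_norm(Q) = (1 / (alpha * n)) * sum of per-sentence class weights."""
    if not labels:
        raise ValueError("at least one text unit is required")
    weights = {"A": alpha, "B": beta, "C": gamma}
    # Each label contributes exactly one weight, mirroring the indicator terms.
    total = sum(weights[label] for label in labels)
    return total / (alpha * len(labels))


score = information_completeness(["A", "B", "C", "A"])
print(score)  # 0.625
```

With these assumed weights, a record with two class-A sentences, one class-B sentence, and one class-C sentence scores (1 + 0.5 + 0 + 1) / 4 = 0.625, and a record made up entirely of class-A sentences scores 1.0. How the score is compared against the thresholds θ₁ and θ₂ is not described in this summary, so the sketch stops at the score.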