2025-11-13T14:19:10.992196

Can LLMs Reconcile Knowledge Conflicts in Counterfactual Reasoning?

Yamin, Ghosal, Wilder

Large Language Models have been shown to contain extensive world knowledge in their parameters, enabling impressive performance on many knowledge-intensive tasks. However, when deployed in novel settings, LLMs often encounter situations where they must integrate parametric knowledge with new or unfamiliar information. In this work, we explore whether LLMs can combine in-context knowledge with their parametric knowledge through the lens of counterfactual reasoning. Through synthetic and real experiments in multi-hop reasoning problems, we show that LLMs generally struggle with counterfactual reasoning, often resorting to exclusively using their parametric knowledge. Moreover, we show that simple post-hoc finetuning can struggle to instill counterfactual reasoning ability, often leading to degradation in stored parametric knowledge. Ultimately, our work reveals important limitations in current LLMs' ability to re-purpose parametric knowledge in novel settings.

academic

Can LLMs Reconcile Knowledge Conflicts in Counterfactual Reasoning?

Basic Information

Paper ID: 2506.15732
Title: Can LLMs Reconcile Knowledge Conflicts in Counterfactual Reasoning?
Authors: Khurram Yamin*, Gaurav Ghosal*, Bryan Wilder (Carnegie Mellon University)
Categories: cs.AI cs.LG
Publication date/venue: ICLR 2026
Paper link: https://arxiv.org/abs/2506.15732v2

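Abstract

Large language models (LLMs) hold extensive world knowledge in their parameters and perform well on many knowledge-intensive tasks. When deployed in new environments, however, LLMs often encounter situations in which they must combine parametric knowledge with new or unfamiliar information. This study uses the lens of counterfactual reasoning to examine whether LLMs can combine in-context knowledge with their parametric knowledge. Through synthetic and real experiments on multi-hop reasoning problems, it shows that LLMs generally struggle with counterfactual reasoning and often rely exclusively on their parametric knowledge. In addition, simple post-hoc fine-tuning struggles to instill counterfactual reasoning ability and often degrades stored parametric knowledge. Overall, the work reveals important limitations in current LLMs' ability to repurpose parametric knowledge in novel settings.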

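Research Background and Motivation

Core Question

The core question this study addresses is whether modern LLMs can selectively combine parametric knowledge with counterfactual premises given in context in order to answer multi-hop questions correctly.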

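Why the Problem Matters

Practical application needs: many real-world scenarios require LLMs to combine pretrained knowledge with novel or hypothetical information supplied at inference time
Knowledge-conflict challenges: retrieval-augmented generation runs into difficulty when external documents conflict with internal knowledge
Safety-critical applications: accurate conditional reasoning is essential in interactive systems, retrieval-augmented pipelines, and safety-critical applications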

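Limitations of Existing Approaches

Existing multi-hop QA benchmarks mainly evaluate a model's ability to recall stored facts or compose chains of parametric knowledge; they do not test the dual requirement of overriding stored facts with context while still retrieving the relevant stored knowledge
Research on knowledge conflicts lacks a systematic exploration of counterfactual multi-hop reasoning
RAG methods can incorporate external information but do not address the distinctive challenges of counterfactual reasoning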

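Research Motivation

Using counterfactual reasoning as a concrete task, the study systematically examines how LLMs behave when facing knowledge conflicts, in particular their ability to perform Contextual Override and Selective Retrieval at the same time.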
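To make the two requirements concrete, here is a minimal sketch on a toy two-hop question. The facts, the `PARAMETRIC` dictionary, and the `answer_two_hop` helper are illustrative assumptions and not the paper's benchmark or code. A counterfactual premise replaces one stored edge (contextual override), while the remaining hop is still taken from stored knowledge (selective retrieval).

```python
# Toy stand-in for a model's stored ("parametric") knowledge.
# All facts and names here are illustrative assumptions, not taken from the paper.
PARAMETRIC = {
    ("Eiffel Tower", "located_in"): "Paris",
    ("Paris", "country"): "France",
    ("Rome", "country"): "Italy",
}


def answer_two_hop(entity, relations, context=None):
    """Follow a chain of relations, preferring in-context facts over stored ones."""
    context = context or {}
    current = entity
    for relation in relations:
        key = (current, relation)
        if key in context:
            # Contextual override: the counterfactual premise replaces the stored fact.
            current = context[key]
        elif key in PARAMETRIC:
            # Selective retrieval: hops the context does not mention come from memory.
            current = PARAMETRIC[key]
        else:
            return None  # The chain cannot be completed.
    return current


question = ("Eiffel Tower", ["located_in", "country"])
# Counterfactual premise: "Suppose the Eiffel Tower were located in Rome."
counterfactual = {("Eiffel Tower", "located_in"): "Rome"}

factual_answer = answer_two_hop(*question)
counterfactual_answer = answer_two_hop(*question, context=counterfactual)
print(factual_answer, counterfactual_answer)
```

In this sketch a system that ignores the premise would still answer "France", while one that applies the override but fails to retrieve the stored fact about Rome could not complete the second hop.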

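Core Contributions

Counterfactual QA benchmark: introduces synthetic graph-based tasks and real-world causal reasoning scenarios that isolate contexts which, relative to the pretrained knowledge graph, are (i) reinforcing, (ii) additive, (iii) contradictory, or (iv) irrelevant
Empirical analysis: through experiments with GPT-4o and other SOTA models, identifies two main failure modes: (a) context ignoring (the model defaults to stored facts) and (b) context overfitting (the model blindly follows the prompt)
Analysis of fine-tuning pitfalls: shows that simple post-hoc fine-tuning on counterfactual examples often yields only marginal gains and can degrade performance on standard factual benchmarks by inducing unintended heuristics
Practical implications: discusses what the findings mean for interactive systems, retrieval-augmented pipelines, and safety-critical applications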
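The four context cases and the two failure modes listed above can be sketched as simple labeling rules over a toy graph. Everything below is a hypothetical illustration: the graph, the `classify_context` and `label_failure` helpers, and in particular the rule used to detect context overfitting (the prediction simply echoes an entity stated in the context) are assumptions, not the paper's definitions or evaluation code.

```python
# Illustrative sketch only: graph contents, labels, and helper names are assumptions.
STORED_GRAPH = {
    ("A", "r1"): "B",
    ("B", "r2"): "C",
    ("X", "r1"): "Y",
}

# Hops that the example question needs; ("C", "r3") is missing from the stored graph.
QUESTION_PATH = {("A", "r1"), ("B", "r2"), ("C", "r3")}


def classify_context(fact, question_path=QUESTION_PATH, graph=STORED_GRAPH):
    """Label an in-context fact relative to the stored graph and the question."""
    (head, relation), tail = fact
    if (head, relation) not in question_path:
        return "irrelevant"      # does not touch any hop the question needs
    if (head, relation) not in graph:
        return "addition"        # supplies an edge the stored graph lacks
    if graph[(head, relation)] == tail:
        return "reinforcement"   # restates what is already stored
    return "contradiction"       # conflicts with the stored edge


def label_failure(prediction, gold, parametric_answer, context_tails):
    """Assumed operationalization of the two failure modes."""
    if prediction == gold:
        return "correct"
    if prediction == parametric_answer:
        return "context_ignoring"     # fell back to the stored answer
    if prediction in context_tails:
        return "context_overfitting"  # echoed the prompt instead of reasoning onward
    return "other"


labels = [
    classify_context((("A", "r1"), "B")),
    classify_context((("C", "r3"), "D")),
    classify_context((("A", "r1"), "Z")),
    classify_context((("X", "r1"), "Q")),
]
print(labels)
```

Checking irrelevance first is a design choice in this sketch: a fact that contradicts the stored graph but lies off the question's path should not change the answer, so it is labeled irrelevant rather than contradictory.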