Do Large Language Models Show Biases in Causal Learning? Insights from Contingency Judgment

Carro, Mester, Selasco et al.

Causal learning is the cognitive process of developing the capability of making causal inferences based on available information, often guided by normative principles. This process is prone to errors and biases, such as the illusion of causality, in which people perceive a causal relationship between two variables despite lacking supporting evidence. This cognitive bias has been proposed to underlie many societal problems, including social prejudice, stereotype formation, misinformation, and superstitious thinking. In this work, we examine whether large language models are prone to developing causal illusions when faced with a classic cognitive science paradigm: the contingency judgment task. To investigate this, we constructed a dataset of 1,000 null contingency scenarios (in which the available information is not sufficient to establish a causal relationship between variables) within medical contexts and prompted LLMs to evaluate the effectiveness of potential causes. Our findings show that all evaluated models systematically inferred unwarranted causal relationships, revealing a strong susceptibility to the illusion of causality. While there is ongoing debate about whether LLMs genuinely understand causality or merely reproduce causal language without true comprehension, our findings support the latter hypothesis and raise concerns about the use of language models in domains where accurate causal reasoning is essential for informed decision-making.

Basic Information

Paper ID: 2510.13985
Title: Do Large Language Models Show Biases in Causal Learning? Insights from Contingency Judgment
Authors: María Victoria Carro, Denise Alejandra Mester, Francisca Gauna Selasco, Giovanni Franco Gabriel Marraffini, Mario Alejandro Leiva, Gerardo I. Simari, María Vanina Martinez
Category: cs.AI
Venue: 39th Conference on Neural Information Processing Systems (NeurIPS 2025) Workshop: First Workshop on CogInterp
Paper link: https://arxiv.org/abs/2510.13985

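Abstract

Causal learning is the cognitive process of making causal inferences from available information, usually guided by normative principles. The process is prone to errors and biases such as the illusion of causality, in which people perceive a causal relationship between two variables despite lacking supporting evidence. This bias has been proposed to underlie many societal problems, including social prejudice, stereotype formation, misinformation, and superstitious thinking. The study uses a classic cognitive science paradigm, the contingency judgment task, to test whether large language models are prone to causal illusions. The authors built a dataset of 1,000 null contingency scenarios (in which the available information is not sufficient to establish a causal relationship between the variables) set in medical contexts, and prompted LLMs to evaluate the effectiveness of potential causes. All evaluated models systematically inferred unwarranted causal relationships, showing a strong susceptibility to the illusion of causality.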

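Background and Motivation

Problem Definition

The core question of the study is: when faced with a classic cognitive science paradigm, do large language models show an illusion-of-causality bias similar to the one observed in humans?

Importance

Social impact: the illusion of causality has been proposed to underlie social prejudice, stereotype formation, the spread of misinformation, and superstitious thinking
Practical applications: in critical domains such as medicine, accurate causal reasoning is essential for informed decision-making
AI safety: as LLMs are increasingly deployed in decision-making systems, understanding their cognitive biases becomes very important

Existing Limitations

No systematic evaluation of how LLMs perform on contingency judgment tasks
Ongoing debate about whether LLMs genuinely "understand" causality or merely reproduce causal language
Existing work focuses mainly on erroneous inference from correlation to causation, not on causal illusions in null contingency scenarios

Research Motivation

Evaluate the causal reasoning of LLMs with the classic contingency judgment task, providing empirical evidence about their cognitive biases.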

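Core Contributions

First adaptation of the contingency judgment task to LLM evaluation: the first study to apply this classic task from experimental psychology to large language models
A large dataset of null contingency scenarios: 1,000 null contingency scenarios in medical contexts, covering four variable types
Evidence that causal illusions are widespread in LLMs: all evaluated models systematically inferred causal relationships in null contingency scenarios
Inconsistent causal judgment criteria across models: different models apply different standards of causal reasoning and do not agree with one another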

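Method Details

Task Definition

The contingency judgment task is a classic paradigm in cognitive science for assessing causal learning:

Input: a series of trials, each recording a potential cause (present/absent) and an outcome (occurs/does not occur)
Output: a rating of the potential cause's effectiveness (0 to 100, where 0 means not effective and 100 means fully effective)
Null contingency condition: the probability of the outcome is the same whether or not the cause is present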
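
The null contingency condition can be checked numerically. The sketch below uses ΔP, the standard contingency index from the causal learning literature (the difference between the outcome probability with and without the cause); the summary itself does not name this index, so its use here is an assumption, and the function names and the example trial counts are purely illustrative.

```python
from collections import Counter


def contingency_counts(trials):
    """Count the 2x2 table cells from (cause_present, outcome_occurred) pairs."""
    cells = Counter(trials)
    a = cells[(True, True)]    # cause present, outcome occurs
    b = cells[(True, False)]   # cause present, outcome does not occur
    c = cells[(False, True)]   # cause absent, outcome occurs
    d = cells[(False, False)]  # cause absent, outcome does not occur
    return a, b, c, d


def delta_p(a, b, c, d):
    """Return P(outcome | cause) - P(outcome | no cause)."""
    if a + b == 0 or c + d == 0:
        raise ValueError("need at least one trial with and one without the cause")
    return a / (a + b) - c / (c + d)


# Illustrative trial sequence (not taken from the paper's dataset):
# the outcome occurs in 80% of trials both with and without the cause.
trials = (
    [(True, True)] * 16 + [(True, False)] * 4
    + [(False, True)] * 8 + [(False, False)] * 2
)
print(delta_p(*contingency_counts(trials)))  # prints 0.0
```

A ΔP of zero means the trials give no support for the cause being effective, so the normatively appropriate rating on the 0 to 100 scale is 0.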

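Experimental Design

Dataset Construction

Variable types (4 categories, 100 variable pairs in total):
- Fictional disease and treatment names (e.g., "Glimber medicine" and "Drizzlemorn disorder")
- Indeterminate variables (e.g., "Disease X" and "Medicine Y")
- Alternative medicine and pseudo-medicine variables (e.g., "Acupuncture Process")
- Scientifically validated drugs (e.g., "Paracetamol")
Scenario generation:
- 1,000 null contingency scenarios
- 20 to 100 trials per scenario
- An 80/20 distribution control is used to ensure null contingency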
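
The generation constraints above can be sketched in code. The summary does not spell out what the 80/20 control refers to, so this example makes a loud assumption: the outcome occurs in 80% of trials and is absent in 20%, at the same rate with and without the cause. The function name `make_null_scenario` and its parameters are hypothetical and are not the authors' generation procedure.

```python
import random


def make_null_scenario(rng, outcome_num=4, outcome_den=5,
                       min_trials=20, max_trials=100):
    """Return (a, b, c, d) cell counts with equal outcome rates in both groups.

    ASSUMPTION: "80/20" is read as an 80% outcome rate (outcome_num/outcome_den)
    applied identically to cause-present and cause-absent trials.
    """
    # Group sizes are multiples of outcome_den so the outcome rate is exact.
    max_units = max_trials // outcome_den
    while True:
        n_present = outcome_den * rng.randint(1, max_units)
        n_absent = outcome_den * rng.randint(1, max_units)
        if min_trials <= n_present + n_absent <= max_trials:
            break
    a = n_present * outcome_num // outcome_den  # cause present, outcome occurs
    b = n_present - a                           # cause present, no outcome
    c = n_absent * outcome_num // outcome_den   # cause absent, outcome occurs
    d = n_absent - c                            # cause absent, no outcome
    return a, b, c, d


rng = random.Random(0)  # fixed seed so the sketch is reproducible
scenarios = [make_null_scenario(rng) for _ in range(1000)]
```

Keeping both group sizes as multiples of five makes the two outcome rates exactly equal in integer counts, which avoids rounding a scenario into a small non-zero contingency.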