
On the permutation invariance principle for causal estimands

Tong, Li
In many causal inference problems, multiple action variables share the same causal role, such as mediators, factors, network units, or genotypes, yet lack a natural ordering. To avoid ambiguity in interpretation, causal estimands should remain unchanged under relabeling, an implicit principle we refer to as permutation invariance. We formally characterize this principle, analyze its algebraic and combinatorial structure for verification, and present a class of weighted estimands that are permutation-invariant while capturing interactions of all orders. We further provide guidance on selecting weights that yield residual-free estimands, whose inclusion-exclusion sums capture the maximal effect, and extend our results to ratio effect measures.

On the Permutation Invariance Principle for Causal Estimands

Basic Information

  • Paper ID: 2510.11863
  • Title: On the permutation invariance principle for causal estimands
  • Authors: Jiaqi Tong, Fan Li (Yale University School of Public Health)
  • Classification: stat.ME (Statistics - Methodology)
  • Publication Date: October 15, 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2510.11863

Abstract

In many causal inference problems, multiple action variables share the same causal role (such as mediators, factors, network units, or genotypes) but lack a natural ordering. To avoid interpretational ambiguity, causal estimands should remain invariant under relabeling, an implicit principle referred to as permutation invariance. This paper formally characterizes the principle, analyzes its algebraic and combinatorial structure for the purpose of verification, and proposes a class of weighted estimands that are permutation invariant while capturing interactions of all orders. It further gives guidance on selecting weights that yield residual-free estimands, whose inclusion-exclusion sums capture the maximal effect, and extends the results to ratio effect measures.

Research Background and Motivation

Problem Statement

Modern causal inference frequently encounters complex scenarios where multiple variables share the same causal interpretation type, including:

  1. Causal mediation analysis with multiple mediators: Multiple unordered mediator variables
  2. Factorial experiments: Multiple factor variables
  3. Causal inference under network interference: Multiple network units
  4. Mendelian randomization: Multiple genotypes (instrumental variables)

Core Issue

When these variables lack intrinsic ordering, a critical consideration is that causal estimands should be permutation invariant, meaning the estimand definition should not change due to variable relabeling. However, in existing literature:

  • The permutation invariance principle is mentioned only informally (e.g., "symmetric estimands" in Xia and Chan (2022))
  • Formal definitions and systematic investigation are lacking
  • Careless practice may produce estimands dependent on labels, leading to interpretational ambiguity

Research Motivation

The paper aims to address the label-dependency problem in multivariate causal inference, to establish theoretical foundations for permutation invariance, and to provide clear guiding principles for practice.

Core Contributions

  1. Theoretical Contribution: First rigorous characterization of the permutation invariance principle, filling a theoretical gap in the literature
  2. Verification Methods: Proposes simple and direct procedures to verify whether a given set of estimands satisfies permutation invariance
  3. Complete Estimand Class: Develops an interpretable, permutation-invariant, complete class of weighted estimands applicable across various causal inference domains
  4. Residual-Free Property: Identifies specific weight choices that produce unique residual-free estimands whose inclusion-exclusion sum captures the maximum effect
  5. Ratio Measure Extension: Extends results to ratio effect measures such as risk ratios and odds ratios

Methodological Details

Task Definition

Given K action variables X = {X₁, ..., X_K}, where each Xₖ (k = 1, ..., K) has two states Xₖ(1) and Xₖ(0), the objective is to define causal estimands that are permutation invariant, that is, unchanged when the variables are relabeled.

Algebraic Framework

Fundamental Concepts

  1. Power Set Representation: Uses power set 2^X to index all 2^K states
  2. Equivalence Relation: Defines equivalence relation ~, such that A ~ B if and only if |A| = |B|
  3. Equivalence Classes: [A] = {B ∈ 2^X : |B| = |A|}, each uniquely indexed by its cardinality q and written [q]
  4. Quotient Set: Q := {[q] : 0 ≤ q ≤ K}
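The grouping of states by cardinality can be made concrete with a short sketch. It assumes the action variables are labeled 0, ..., K-1 and that a state is represented by the subset of variables it indexes; the function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import comb

def power_set(K):
    """All subsets of {0, ..., K-1} as frozensets, ordered by cardinality."""
    return [frozenset(c) for q in range(K + 1) for c in combinations(range(K), q)]

def equivalence_classes(K):
    """Group subsets by cardinality: class q holds every subset of size q."""
    classes = {q: [] for q in range(K + 1)}
    for A in power_set(K):
        classes[len(A)].append(A)
    return classes

K = 3
classes = equivalence_classes(K)
for q, members in classes.items():
    # Class q contains comb(K, q) subsets; the quotient set has K + 1 classes.
    print(q, comb(K, q), [sorted(A) for A in members])
```

For K variables this yields 2^K states but only K + 1 equivalence classes, which is what makes counting-based verification tractable.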

Permutation Invariance Definition

Definition 1 (Permutation Matrix): A permutation matrix is a square binary matrix with exactly one 1 in each row and column.

Definition 2 (Permutation Invariance): A contrast vector Δ is permutation invariant if and only if, for every column permutation matrix Pₒ induced by a relabeling σ of the action variables, there exists a row permutation matrix Pᵣ such that PᵣH = HPₒ.

Verification Algorithm

Theoretical Foundation

Theorem 1: A contrast vector Δ is permutation invariant if and only if R(HPₒ) = R(H) for every permutation σ ∈ P, where Pₒ is the column permutation matrix induced by σ and R(H) denotes the multiset of rows of the matrix H.
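The multiset criterion can be sketched in a few lines of Python. The sketch rests on an assumption not spelled out in this summary: each row of H is taken to be the coefficient vector of one estimand over the 2^K states, with columns ordered as in the hypothetical power_set helper. The two example matrices are illustrative and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations, permutations

def power_set(K):
    """All subsets of {0, ..., K-1} as frozensets, ordered by cardinality."""
    return [frozenset(c) for q in range(K + 1) for c in combinations(range(K), q)]

def permute_columns(H, K, sigma):
    """Permute the columns of H as induced by relabeling variable k as sigma[k]."""
    states = power_set(K)
    col = {A: j for j, A in enumerate(states)}
    # New column j takes the old column of the relabeled state sigma(states[j]).
    src = [col[frozenset(sigma[k] for k in A)] for A in states]
    return [tuple(row[s] for s in src) for row in H]

def is_permutation_invariant(H, K):
    """True if every relabeling leaves the multiset of rows of H unchanged."""
    base = Counter(map(tuple, H))
    return all(Counter(permute_columns(H, K, s)) == base
               for s in permutations(range(K)))

# Columns for K = 2 are ordered: {}, {0}, {1}, {0, 1}.
symmetric = [(1, -1, 0, 0), (1, 0, -1, 0)]    # f({}) - f({0}) and f({}) - f({1})
sequential = [(1, -1, 0, 0), (0, 1, 0, -1)]   # f({}) - f({0}) and f({0}) - f({0, 1})
print(is_permutation_invariant(symmetric, 2))   # True
print(is_permutation_invariant(sequential, 2))  # False
```

The second set depends on which variable is switched first, so swapping the labels produces a different pair of contrasts and the row multisets no longer agree.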

Verification Algorithm

Algorithm 1:

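  1. Compute H' = HPₒ
  2. For i = 1 to d, set σ(i) = j such that rᵢ = r'ⱼ, where rᵢ is the i-th row of H, r'ⱼ is the j-th row of H', and d is the number of rows; each j is used at most once, and if some row has no match, Δ is not permutation invariant by Theorem 1
  3. Output the row permutation matrix Pᵣ corresponding to the permutation σ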
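The row-matching steps above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes H and the column-permuted matrix H' are already available as lists of rows of equal length, and the helper names are hypothetical.

```python
def match_rows(H, H_perm):
    """Return sigma with H[i] == H_perm[sigma[i]], or None if no matching exists."""
    unused = {}
    for j, row in enumerate(H_perm):
        unused.setdefault(tuple(row), []).append(j)
    sigma = []
    for row in H:
        candidates = unused.get(tuple(row))
        if not candidates:
            return None  # row multisets differ, so invariance fails
        sigma.append(candidates.pop())  # use each row of H_perm at most once
    return sigma

def row_permutation_matrix(sigma):
    """Build P_r with P_r[sigma[i]][i] = 1, so that P_r H equals H_perm."""
    d = len(sigma)
    P = [[0] * d for _ in range(d)]
    for i, j in enumerate(sigma):
        P[j][i] = 1
    return P

def matmul(A, B):
    """Plain matrix product on lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

H = [(1, -1, 0, 0), (1, 0, -1, 0)]
H_perm = [(1, 0, -1, 0), (1, -1, 0, 0)]  # H with the columns of {0} and {1} swapped
sigma = match_rows(H, H_perm)
P_r = row_permutation_matrix(sigma)
print(sigma, P_r, matmul(P_r, H) == [list(r) for r in H_perm])
```

Keeping a list of unused indices per distinct row handles repeated rows, which is why the criterion is stated in terms of multisets rather than sets.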

Complete Estimand Class

Weighted Estimand Definition

Definition 3: The interpretable complete estimand class for K action variables is:

ΔY = Σ(T⊆Yᶜ) w(T,Y)[Σ(Z⊆Y) (-1)^|Z| f(Z∪T)]

where w is a normalized weight function.
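A direct transcription of the formula in Definition 3 may help fix ideas. The sketch assumes that variables are labeled 0, ..., K-1, that f maps a subset of variables to a number, and that "normalized" means the weights w(T, Y) sum to one over T ⊆ Yᶜ; the last point is an assumption, as are the toy values of f and the two weight functions.

```python
from itertools import combinations

def subsets(S):
    """All subsets of the collection S as frozensets."""
    S = sorted(S)
    return [frozenset(c) for q in range(len(S) + 1) for c in combinations(S, q)]

def weighted_estimand(f, w, Y, K):
    """Delta_Y = sum over T in Y^c of w(T, Y) * sum over Z in Y of (-1)^|Z| f(Z | T)."""
    Y = frozenset(Y)
    Yc = frozenset(range(K)) - Y
    return sum(
        w(T, Y) * sum((-1) ** len(Z) * f(Z | T) for Z in subsets(Y))
        for T in subsets(Yc)
    )

# Toy values of f for K = 2 (illustrative only).
table = {frozenset(): 10, frozenset({0}): 7, frozenset({1}): 8, frozenset({0, 1}): 4}
f = table.__getitem__

point_mass = lambda T, Y: 1.0 if not T else 0.0       # all weight on T = {}
uniform = lambda T, Y: 1.0 / 2 ** (2 - len(Y))        # equal weight on each T, K = 2

print(weighted_estimand(f, point_mass, {0}, 2))   # 3.0
print(weighted_estimand(f, uniform, {0}, 2))      # 3.5
print(weighted_estimand(f, uniform, {0, 1}, 2))   # -1.0
```

When Y is the full set, Yᶜ is empty and every normalized weight gives the same highest-order interaction contrast; the weights matter only for lower-order estimands.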

Two Types of Weights

  1. Permutable Weights: Weights that co-permute with action variables
  2. Invariant Weights: Weights that remain unchanged under action variable permutations

Theorem 2:

  • For permutable weights: The subclass {ΔY : |Y| = q}, which collects the estimands indexed by subsets of the same cardinality q, is permutation invariant
  • For invariant weights: Additional conditions are required to ensure permutation invariance
  • The complete class {ΔY : ∅ ≠ Y ∈ 2^X} is both permutation invariant and complete

Residual-Free Estimands

Residual-Free Property Definition

Definition 4: An estimand class Δ is residual-free if its inclusion-exclusion sum equals the maximum effect:

Σ(∅≠Y⊆X) (-1)^(|Y|+1) ΔY = f(∅) - f(X)

Uniqueness Result

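Theorem 3: For the estimand class with invariant weights, the residual (the gap between the inclusion-exclusion sum and the maximum effect) equals zero if and only if w(T,Y) = 1(T = ∅), where 1(·) is the indicator function; in that case the estimand reduces to ΔY = Σ(Z⊆Y) (-1)^|Z| f(Z).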
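As a worked check for K = 2, the residual-free identity can be expanded by hand using the point-mass form ΔY = Σ(Z⊆Y) (-1)^|Z| f(Z). The subscripts on Δ below are shorthand introduced here for the subsets {X₁}, {X₂}, and {X₁, X₂}.

```latex
\begin{align*}
\Delta_{\{1\}}   &= f(\emptyset) - f(\{X_1\}), \\
\Delta_{\{2\}}   &= f(\emptyset) - f(\{X_2\}), \\
\Delta_{\{1,2\}} &= f(\emptyset) - f(\{X_1\}) - f(\{X_2\}) + f(\{X_1, X_2\}), \\
\Delta_{\{1\}} + \Delta_{\{2\}} - \Delta_{\{1,2\}}
                 &= f(\emptyset) - f(\{X_1, X_2\}).
\end{align*}
```

The single-variable terms cancel against the interaction term, leaving exactly the maximum effect f(∅) - f(X) from Definition 4.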

Experimental Setup

Application Domains

The paper primarily validates methods through theoretical examples and mathematical proofs, involving:

  1. Causal Mediation Analysis: Cases with K=2 and K=3 multiple mediators
  2. Factorial Experiments: 2^K factorial designs
  3. Network Interference: Multi-unit network analysis
  4. Mendelian Randomization: Multi-genotype analysis

Verification Methods

  • Algebraic verification: Verifying permutation invariance through matrix operations
  • Combinatorial verification: Using multiset counting methods
  • Case analysis: Detailed calculations for specific K=2,3 cases

Experimental Results

Permutation Invariance Verification

Example 1 vs Example 2:

  • Lange et al. (2014) estimands: Do not satisfy permutation invariance
  • Xia and Chan (2022) exit indirect effects: Satisfy permutation invariance

Weight Selection Effects

Residual-Free Property:

  • Point mass weight w(T,Y) = 1(T = ∅) produces the unique residual-free estimand
  • Other weight choices produce non-zero residual effects
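The contrast between the two bullets can be checked numerically. The sketch below is illustrative only: the values of f are made up, the uniform weight is one arbitrary alternative to the point mass, and the helper names are hypothetical. Note that for an f without interactions the residual would vanish under any normalized weight, so the toy f deliberately includes an interaction.

```python
from itertools import combinations

def subsets(S):
    """All subsets of the collection S as frozensets."""
    S = sorted(S)
    return [frozenset(c) for q in range(len(S) + 1) for c in combinations(S, q)]

def estimand(f, w, Y, K):
    """Weighted estimand Delta_Y from Definition 3."""
    Yc = frozenset(range(K)) - Y
    return sum(w(T, Y) * sum((-1) ** len(Z) * f(Z | T) for Z in subsets(Y))
               for T in subsets(Yc))

def residual(f, w, K):
    """Inclusion-exclusion sum of the estimands minus the maximum effect."""
    X = frozenset(range(K))
    total = sum((-1) ** (len(Y) + 1) * estimand(f, w, Y, K)
                for Y in subsets(X) if Y)
    return total - (f(frozenset()) - f(X))

def point_mass(T, Y):
    """Weight 1(T = {})."""
    return 1.0 if not T else 0.0

def make_uniform(K):
    """Equal weight on every T in the complement of Y (an illustrative choice)."""
    return lambda T, Y: 1.0 / 2 ** (K - len(Y))

table = {frozenset(): 10, frozenset({0}): 7, frozenset({1}): 8, frozenset({0, 1}): 4}
f = table.__getitem__
print(residual(f, point_mass, 2))       # 0.0
print(residual(f, make_uniform(2), 2))  # 1.0
```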

Ratio Measure Extension

Corollaries 1-2 demonstrate:

  • Risk ratios: ΔY = Π(Z⊆Y) f(Z)^((-1)^|Z|)
  • Odds ratios: Corresponding multiplicative structures
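The risk-ratio formula is the multiplicative counterpart of the additive point-mass estimand: taking logarithms turns the product into Σ(Z⊆Y) (-1)^|Z| log f(Z). The sketch below illustrates this with made-up risks for K = 2; the function names and numbers are assumptions for illustration.

```python
import math
from itertools import combinations

def subsets(S):
    """All subsets of the collection S as frozensets."""
    S = sorted(S)
    return [frozenset(c) for q in range(len(S) + 1) for c in combinations(S, q)]

def ratio_estimand(f, Y):
    """Delta_Y = product over Z in Y of f(Z) ** ((-1) ** |Z|)."""
    return math.prod(f(Z) ** ((-1) ** len(Z)) for Z in subsets(Y))

def log_additive_estimand(f, Y):
    """Additive contrast on the log scale: sum of (-1)^|Z| * log f(Z)."""
    return sum((-1) ** len(Z) * math.log(f(Z)) for Z in subsets(Y))

# Toy risks for K = 2 (illustrative only).
risk = {frozenset(): 0.4, frozenset({0}): 0.2, frozenset({1}): 0.25, frozenset({0, 1}): 0.1}
f = risk.__getitem__

for Y in ({0}, {1}, {0, 1}):
    rr = ratio_estimand(f, Y)
    # The log of the ratio estimand matches the additive contrast of log f.
    print(sorted(Y), rr, math.isclose(math.log(rr), log_additive_estimand(f, Y)))
```

Because the ratio estimand is an additive contrast on the log scale, the relabeling arguments for additive estimands carry over directly.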

Existing Research

  1. Causal Mediation Analysis: Lange et al. (2014), Xia and Chan (2022)
  2. Factorial Experiments: Dasgupta et al. (2015), Zhao and Ding (2022)
  3. Network Interference: Hudgens and Halloran (2008)
  4. Mendelian Randomization: Hartwig et al. (2017)

Contributions of This Paper

  • First formal definition of permutation invariance
  • Unifies estimands across different domains
  • Provides systematic verification and construction methods

Conclusions and Discussion

Main Conclusions

  1. Permutation invariance is a fundamental principle in causal inference
  2. Permutation invariance can be verified through simple multiset counting
  3. Under invariant weights, the residual-free estimand is unique and corresponds to the point-mass weight w(T,Y) = 1(T = ∅)
  4. Methods are applicable across multiple causal inference domains

Limitations

  1. Currently considers only binary action variables
  2. Theoretical framework requires extension to multi-state cases
  3. Computational complexity in practical applications is insufficiently discussed

Future Directions

  1. Extension to multi-category factorial experiments
  2. Handling ordered treatments in multi-mediator analysis
  3. Development of computationally more efficient algorithms

In-Depth Evaluation

Strengths

  1. Theoretical Rigor: First rigorous mathematical characterization of permutation invariance
  2. Method Generality: Unified framework applicable across multiple causal inference domains
  3. Practical Value: Provides explicit verification algorithms and construction methods
  4. Completeness: Complete theoretical system from definition through verification to construction

Weaknesses

  1. Limited Application Scope: Restricted to binary action variables
  2. Insufficient Empirical Validation: Relies primarily on theoretical proofs, lacking large-scale real data validation
  3. Computational Complexity: Computational efficiency issues for large K values insufficiently addressed

Impact

  1. Theoretical Contribution: Provides important theoretical foundation for causal inference
  2. Practical Guidance: Offers concrete methods to avoid label-dependency
  3. Cross-Domain Application: Unifies methodology across multiple subfields

Applicable Scenarios

  1. Causal analysis with multiple mediators
  2. Experimental design with unordered factors
  3. Causal inference with network data
  4. Mendelian randomization with multiple instrumental variables

References

  1. Xia, F. and Chan, K. C. G. (2022). Decomposition, identification and multiply robust estimation of natural mediation effects with multiple mediators. Biometrika.
  2. Zhao, A. and Ding, P. (2022). Regression-based causal inference with factorial experiments. Biometrika.
  3. Dasgupta, T., Pillai, N. S., and Rubin, D. B. (2015). Causal inference from 2^k factorial designs by using potential outcomes. Journal of the Royal Statistical Society: Series B.
  4. Hudgens, M. G. and Halloran, M. E. (2008). Toward causal inference with interference. Journal of the American Statistical Association.