On the permutation invariance principle for causal estimands

Tong, Li

In many causal inference problems, multiple action variables share the same causal role, such as mediators, factors, network units, or genotypes, yet lack a natural ordering. To avoid ambiguity in interpretation, causal estimands should remain unchanged under relabeling, an implicit principle we refer to as permutation invariance. We formally characterize this principle, analyze its algebraic and combinatorial structure for verification, and present a class of weighted estimands that are permutation-invariant while capturing interactions of all orders. We further provide guidance on selecting weights that yield residual-free estimands, whose inclusion-exclusion sums capture the maximal effect, and extend our results to ratio effect measures.

On the Permutation Invariance Principle for Causal Estimands

Basic Information

Paper ID: 2510.11863
Title: On the permutation invariance principle for causal estimands
Authors: Jiaqi Tong, Fan Li (Yale University School of Public Health)
Classification: stat.ME (Statistics - Methodology)
Publication Date: October 15, 2025 (arXiv preprint)
Paper Link: https://arxiv.org/abs/2510.11863

Abstract

In many causal inference problems, multiple action variables share the same causal role (for example mediators, factors, network units, or genotypes) but lack a natural ordering. To avoid interpretational ambiguity, causal estimands should remain invariant under relabeling, an implicit principle the authors call permutation invariance. The paper formally characterizes this principle, analyzes its algebraic and combinatorial structure for verification, and proposes a class of weighted estimands that are permutation invariant while capturing interactions of all orders. It further gives guidance on selecting weights that yield residual-free estimands, whose inclusion-exclusion sums capture the maximal effect, and extends the results to ratio effect measures.

Research Background and Motivation

Problem Statement

Modern causal inference frequently encounters complex scenarios where multiple variables share the same causal interpretation type, including:

Causal mediation analysis with multiple mediators: Multiple unordered mediator variables
Factorial experiments: Multiple factor variables
Causal inference under network interference: Multiple network units
Mendelian randomization: Multiple genotypes (instrumental variables)

Core Issue

When these variables lack an intrinsic ordering, causal estimands should be permutation invariant: the definition of an estimand should not change when the variables are relabeled. However, in the existing literature:

The permutation invariance principle is mentioned only informally (e.g., "symmetric estimands" in Xia and Chan (2022))
Formal definitions and systematic investigation are lacking
Careless practice may produce estimands dependent on labels, leading to interpretational ambiguity

Research Motivation

The paper aims to address the label-dependency problem in multivariate causal inference, establish theoretical foundations for permutation invariance, and provide clear guiding principles for practice.

Core Contributions

Theoretical Contribution: First rigorous characterization of the permutation invariance principle, filling a theoretical gap in the literature
Verification Methods: Proposes simple and direct procedures to verify whether a given set of estimands satisfies permutation invariance
Complete Estimand Class: Develops an interpretable, permutation-invariant, complete class of weighted estimands applicable across various causal inference domains
Residual-Free Property: Identifies specific weight choices that produce unique residual-free estimands whose inclusion-exclusion sum captures the maximum effect
Ratio Measure Extension: Extends results to ratio effect measures such as risk ratios and odds ratios

Methodological Details

Task Definition

Given K action variables X = {X₁, ..., X_K}, each Xₖ with two states Xₖ(1) and Xₖ(0), the objective is to define causal estimands that remain unchanged when the variables are relabeled.

Algebraic Framework

Fundamental Concepts

Power Set Representation: Uses power set 2^X to index all 2^K states
Equivalence Relation: Defines equivalence relation ~, such that A ~ B if and only if |A| = |B|
Equivalence Classes: [A] = {B ∈ 2^X : |B| = |A|}, uniquely indexed by its cardinality q and written [q]
Quotient Set: Q := {[q] : 0 ≤ q ≤ K}
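
The grouping of states by cardinality can be made concrete with a small sketch. The function names `power_set` and `equivalence_classes` are hypothetical, and labeling the variables 1, ..., K is an assumption made only for illustration.

```python
from itertools import combinations

def power_set(K):
    """All subsets of {1, ..., K} as frozensets (2**K of them)."""
    items = range(1, K + 1)
    return [frozenset(c) for q in range(K + 1) for c in combinations(items, q)]

def equivalence_classes(K):
    """Group subsets by cardinality: class q holds every subset of size q."""
    classes = {q: [] for q in range(K + 1)}
    for subset in power_set(K):
        classes[len(subset)].append(subset)
    return classes

classes = equivalence_classes(3)
for q, members in classes.items():
    print(q, [sorted(m) for m in members])
```

For K = 3 the class sizes are the binomial coefficients 1, 3, 3, 1, so the quotient set has K + 1 = 4 elements.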

Permutation Invariance Definition

Definition 1 (Permutation Matrix): A permutation matrix is a square binary matrix with exactly one 1 in each row and column.

Definition 2 (Permutation Invariance): A contrast vector Δ with coefficient matrix H is permutation invariant if and only if, for every column permutation matrix Pₒ induced by a relabeling of the action variables, there exists a row permutation matrix Pᵣ such that PᵣH = HPₒ.

Verification Algorithm

Theoretical Foundation

Theorem 1: A contrast vector Δ is permutation invariant if and only if R(HPₒ) = R(H) for every permutation σ ∈ P of the action variables, where Pₒ is the column permutation matrix induced by σ and R(H) denotes the multiset of rows of H.

Verification Algorithm

Algorithm 1:

Compute H' = HPₒ
For i = 1 to d, set σ(i) = j such that rᵢ = r'ⱼ, where rᵢ is the i-th row of H and r'ⱼ is the j-th row of H'
Output Pᵣ corresponding to permutation σ
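
The multiset criterion and the row-matching steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the rows of H are taken to be contrast coefficients over the 2^K states in the order produced by the hypothetical helper `power_set`, and the two K = 2 matrices are made-up examples rather than estimands from the cited papers.

```python
from collections import Counter
from itertools import combinations, permutations

def power_set(K):
    """States ordered by size, then lexicographically."""
    items = range(1, K + 1)
    return [frozenset(c) for q in range(K + 1) for c in combinations(items, q)]

def permute_columns(H, states, sigma):
    """Return H' = H P, with P the column permutation induced by relabeling sigma."""
    index = {s: j for j, s in enumerate(states)}
    H_new = [[0] * len(states) for _ in H]
    for j, s in enumerate(states):
        j_new = index[frozenset(sigma[x] for x in s)]
        for i, row in enumerate(H):
            H_new[i][j_new] = row[j]
    return H_new

def row_permutation(H, H_new):
    """Match each row of H to an equal, unused row of H_new (None if impossible)."""
    unused = list(range(len(H_new)))
    match = []
    for row in H:
        j = next((j for j in unused if H_new[j] == row), None)
        if j is None:
            return None
        unused.remove(j)
        match.append(j)
    return match

def is_permutation_invariant(H, K):
    """Check that every relabeling leaves the multiset of rows unchanged."""
    states = power_set(K)
    labels = range(1, K + 1)
    for image in permutations(labels):
        sigma = dict(zip(labels, image))
        H_new = permute_columns(H, states, sigma)
        if Counter(map(tuple, H_new)) != Counter(map(tuple, H)):
            return False
    return True

# Columns for K = 2: f(empty), f({1}), f({2}), f({1,2}).
H_sym = [[1, -1, 0, 0], [1, 0, -1, 0]]   # f(empty)-f({1}), f(empty)-f({2})
H_seq = [[1, -1, 0, 0], [0, 1, 0, -1]]   # f(empty)-f({1}), f({1})-f({1,2})
print(is_permutation_invariant(H_sym, 2))  # True
print(is_permutation_invariant(H_seq, 2))  # False
```

The second matrix fails because swapping the two labels turns its rows into contrasts that were not in the original set, which is the label dependence the principle is meant to rule out.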

Complete Estimand Class

Weighted Estimand Definition

Definition 3: The interpretable complete estimand class for K action variables is:

ΔY = Σ(T⊆Yᶜ) w(T,Y)[Σ(Z⊆Y) (-1)^|Z| f(Z∪T)]

where w is a normalized weight function.
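
A direct transcription of the formula in Definition 3 may help in reading it. The sketch below assumes that "normalized" means the weights w(T, Y) sum to one over T ⊆ Yᶜ; the outcome values in `f` and the uniform weight are hypothetical choices for illustration only.

```python
from itertools import combinations

def subsets(items):
    """All subsets of a collection, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for q in range(len(items) + 1) for c in combinations(items, q)]

def weighted_estimand(Y, X, f, w):
    """Delta_Y = sum over T in Y^c of w(T, Y) * sum over Z in Y of (-1)^|Z| f(Z u T)."""
    Y = frozenset(Y)
    complement = frozenset(X) - Y
    total = 0.0
    for T in subsets(complement):
        inner = sum((-1) ** len(Z) * f[Z | T] for Z in subsets(Y))
        total += w(T, Y) * inner
    return total

# Hypothetical mean outcomes for K = 2, indexed by state.
X = frozenset({1, 2})
f = {frozenset(): 1.0, frozenset({1}): 0.7, frozenset({2}): 0.6, frozenset({1, 2}): 0.2}

def uniform_weight(T, Y):
    """Equal weight on every T in the complement of Y (an illustrative choice)."""
    return 1.0 / 2 ** (len(X) - len(Y))

print(weighted_estimand({1}, X, f, uniform_weight))     # about 0.35
print(weighted_estimand({1, 2}, X, f, uniform_weight))  # about -0.1
```

With Y = {1}, the estimand averages the contrast f(∅) − f({1}) and the contrast f({2}) − f({1,2}); with Y = X the complement is empty and only the highest-order interaction contrast remains.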

Two Types of Weights

Permutable Weights: Weights that co-permute with action variables
Invariant Weights: Weights that remain unchanged under action variable permutations

Theorem 2:

For permutable weights: The subclass {ΔY : Y ∈ [q]}, that is, the estimands for all Y of cardinality q, is permutation invariant
For invariant weights: Additional conditions are required to ensure permutation invariance
The complete class {ΔY : ∅ ≠ Y ∈ 2^X} is both permutation invariant and complete
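
The subclass claim can be checked by brute force for small K. The sketch below is an assumption-laden illustration: it uses a weight that depends on T and Y only through their sizes (so relabeling T and Y together leaves it unchanged) as a stand-in for a well-behaved weight, and a second, deliberately label-dependent weight as a counterexample. Neither weight is taken from the paper, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, permutations

def subsets(items):
    items = sorted(items)
    return [frozenset(c) for q in range(len(items) + 1) for c in combinations(items, q)]

def contrast_row(Y, X, w):
    """Coefficient of Delta_Y on each state A: w(A - Y, Y) * (-1)^|A & Y|."""
    return {A: w(A - Y, Y) * (-1) ** len(A & Y) for A in subsets(X)}

def relabel_row(row, sigma):
    """Apply a relabeling sigma to the states that index a contrast row."""
    return {frozenset(sigma[x] for x in A): c for A, c in row.items()}

def canonical(row):
    """Hashable form of a row, independent of dictionary order."""
    return tuple(sorted((tuple(sorted(A)), c) for A, c in row.items()))

def subclass_is_invariant(q, K, w):
    """Do the rows {Delta_Y : |Y| = q} form the same multiset after every relabeling?"""
    X = frozenset(range(1, K + 1))
    rows = [contrast_row(Y, X, w) for Y in subsets(X) if len(Y) == q]
    reference = Counter(canonical(r) for r in rows)
    labels = sorted(X)
    for image in permutations(labels):
        sigma = dict(zip(labels, image))
        if Counter(canonical(relabel_row(r, sigma)) for r in rows) != reference:
            return False
    return True

K = 3
size_only = lambda T, Y: 1.0 / 2 ** (K - len(Y))                      # depends on |Y| only
label_dependent = lambda T, Y: 1.0 if T == frozenset({1}) - Y else 0.0  # singles out label 1

print(all(subclass_is_invariant(q, K, size_only) for q in range(1, K + 1)))  # True
print(subclass_is_invariant(1, K, label_dependent))                          # False
```

The label-dependent weight always conditions on variable 1 when it is available, so the resulting first-order estimands change when the labels are swapped.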

Residual-Free Estimands

Residual-Free Property Definition

Definition 4: An estimand class Δ is residual-free if its inclusion-exclusion sum equals the maximum effect:

Σ(∅≠Y⊆X) (-1)^(|Y|+1) ΔY = f(∅) - f(X)
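
As a worked check for K = 2, assume the weighted estimand of Definition 3 with all weight placed on T = ∅, so that ΔY = Σ(Z⊆Y) (-1)^|Z| f(Z). The inclusion-exclusion sum then collapses term by term:

```latex
\begin{align*}
\Delta_{\{X_1\}} + \Delta_{\{X_2\}} - \Delta_{\{X_1, X_2\}}
&= \bigl[f(\emptyset) - f(\{X_1\})\bigr] + \bigl[f(\emptyset) - f(\{X_2\})\bigr] \\
&\quad - \bigl[f(\emptyset) - f(\{X_1\}) - f(\{X_2\}) + f(\{X_1, X_2\})\bigr] \\
&= f(\emptyset) - f(\{X_1, X_2\}).
\end{align*}
```

The single-variable terms cancel against the interaction contrast, leaving exactly f(∅) - f(X) with no residual.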

Application Settings

Causal Mediation Analysis: Cases with K=2 and K=3 multiple mediators
Factorial Experiments: 2^K factorial designs
Network Interference: Multi-unit network analysis
Mendelian Randomization: Multi-genotype analysis

Verification Methods

Algebraic verification: Verifying permutation invariance through matrix operations
Combinatorial verification: Using multiset counting methods
Case analysis: Detailed calculations for specific K=2,3 cases

Experimental Results

Permutation Invariance Verification

Example 1 vs Example 2:

Lange et al. (2014) estimands: Do not satisfy permutation invariance
Xia and Chan (2022) exit indirect effects: Satisfy permutation invariance

Weight Selection Effects

Residual-Free Property:

Point mass weight w(T,Y) = 1(T = ∅) produces the unique residual-free estimand
Other weight choices produce non-zero residual effects
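
The effect of the weight choice on the residual can be checked numerically. The sketch below uses hypothetical outcome values for K = 2 and compares the point-mass weight with a uniform weight chosen only for illustration; it demonstrates one non-zero residual and does not establish uniqueness.

```python
from itertools import combinations

def subsets(items):
    items = sorted(items)
    return [frozenset(c) for q in range(len(items) + 1) for c in combinations(items, q)]

def weighted_estimand(Y, X, f, w):
    """Delta_Y = sum over T in Y^c of w(T, Y) * sum over Z in Y of (-1)^|Z| f(Z u T)."""
    total = 0.0
    for T in subsets(X - Y):
        inner = sum((-1) ** len(Z) * f[Z | T] for Z in subsets(Y))
        total += w(T, Y) * inner
    return total

def inclusion_exclusion_sum(X, f, w):
    """Sum over non-empty Y of (-1)^(|Y|+1) * Delta_Y."""
    return sum((-1) ** (len(Y) + 1) * weighted_estimand(Y, X, f, w)
               for Y in subsets(X) if Y)

# Hypothetical mean outcomes for K = 2.
X = frozenset({1, 2})
f = {frozenset(): 1.0, frozenset({1}): 0.7, frozenset({2}): 0.6, frozenset({1, 2}): 0.2}

point_mass = lambda T, Y: 1.0 if not T else 0.0        # w(T, Y) = 1(T = empty)
uniform = lambda T, Y: 1.0 / 2 ** (len(X) - len(Y))    # illustrative alternative

maximal_effect = f[frozenset()] - f[X]
residual_point = inclusion_exclusion_sum(X, f, point_mass) - maximal_effect
residual_uniform = inclusion_exclusion_sum(X, f, uniform) - maximal_effect
print(residual_point)    # zero up to floating-point rounding
print(residual_uniform)  # about 0.1
```

For these values the uniform-weight residual equals f({1}) + f({2}) − f({1,2}) − f(∅), the negative of the two-way interaction contrast, so it vanishes only when that interaction is zero.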

Ratio Measure Extension

Corollaries 1-2 demonstrate:

Risk ratios: ΔY = Π(Z⊆Y) f(Z)^((-1)^|Z|)
Odds ratios: Corresponding multiplicative structures
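
The multiplicative form for risk ratios can be computed directly, and taking logarithms turns it back into the additive alternating sum. The sketch below uses hypothetical risks for K = 2; the function name `ratio_estimand` is an assumption for illustration.

```python
import math
from itertools import combinations

def subsets(items):
    items = sorted(items)
    return [frozenset(c) for q in range(len(items) + 1) for c in combinations(items, q)]

def ratio_estimand(Y, f):
    """Delta_Y = product over Z in Y of f(Z) ** ((-1) ** |Z|)."""
    return math.prod(f[Z] ** ((-1) ** len(Z)) for Z in subsets(Y))

# Hypothetical risks under each state for K = 2.
f = {frozenset(): 0.40, frozenset({1}): 0.20, frozenset({2}): 0.25, frozenset({1, 2}): 0.10}

rr_single = ratio_estimand({1}, f)     # f(empty) / f({1}), about 2.0
rr_joint = ratio_estimand({1, 2}, f)   # f(empty) f({1,2}) / (f({1}) f({2})), about 0.8

# On the log scale the product becomes the additive alternating sum.
log_sum = sum((-1) ** len(Z) * math.log(f[Z]) for Z in subsets({1, 2}))
print(rr_single, rr_joint, math.isclose(math.log(rr_joint), log_sum))
```

Even-sized subsets enter the numerator and odd-sized subsets the denominator, which mirrors the sign pattern of the difference-scale estimand.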

Existing Research

Causal Mediation Analysis: Lange et al. (2014), Xia and Chan (2022)
Factorial Experiments: Dasgupta et al. (2015), Zhao and Ding (2022)
Network Interference: Hudgens and Halloran (2008)
Mendelian Randomization: Hartwig et al. (2017)

Contributions of This Paper

First formal definition of permutation invariance
Unifies estimands across different domains
Provides systematic verification and construction methods

Conclusions and Discussion

Main Conclusions

Permutation invariance is a fundamental principle in causal inference
Can be verified through simple multiset counting
A unique class of residual-free estimands exists
Methods are applicable across multiple causal inference domains

Limitations

Currently considers only binary action variables
Theoretical framework requires extension to multi-state cases
Computational complexity in practical applications is insufficiently discussed

Future Directions

Extension to multi-category factorial experiments
Handling ordered treatments in multi-mediator analysis
Development of computationally more efficient algorithms

In-Depth Evaluation

Strengths

Theoretical Rigor: First rigorous mathematical characterization of permutation invariance
Method Generality: Unified framework applicable across multiple causal inference domains
Practical Value: Provides explicit verification algorithms and construction methods
Completeness: Complete theoretical system from definition through verification to construction

Weaknesses

Limited Application Scope: Restricted to binary action variables
Insufficient Empirical Validation: Relies primarily on theoretical proofs, lacking large-scale real data validation
Computational Complexity: Computational efficiency issues for large K values insufficiently addressed

Impact

Theoretical Contribution: Provides important theoretical foundation for causal inference
Practical Guidance: Offers concrete methods to avoid label-dependency
Cross-Domain Application: Unifies methodology across multiple subfields

Applicable Scenarios

Causal analysis with multiple mediators
Experimental design with unordered factors
Causal inference with network data
Mendelian randomization with multiple instrumental variables

References

Xia, F. and Chan, K. C. G. (2022). Decomposition, identification and multiply robust estimation of natural mediation effects with multiple mediators. Biometrika.
Zhao, A. and Ding, P. (2022). Regression-based causal inference with factorial experiments. Biometrika.
Dasgupta, T., Pillai, N. S., and Rubin, D. B. (2015). Causal inference from 2^k factorial designs by using potential outcomes. Journal of the Royal Statistical Society: Series B.
Hudgens, M. G. and Halloran, M. E. (2008). Toward causal inference with interference. Journal of the American Statistical Association.