Ancestor regression in structural vector autoregressive models

Schultheiss, Ulmer, Bühlmann
We present a new method for causal discovery in linear structural vector autoregressive models. We adapt an idea designed for independent observations to the case of time series while retaining its favorable properties, i.e., explicit error control for false causal discovery, at least asymptotically. We apply our method to several real-world bivariate time series datasets and discuss its findings which mostly agree with common understanding. The arrow of time in a model can be interpreted as background knowledge on possible causal mechanisms. Hence, our ideas could be extended to incorporating different background knowledge, even for independent observations.

Basic Information

  • Paper ID: 2403.03778
  • Title: Ancestor regression in structural vector autoregressive models
  • Authors: Christoph Schultheiss, Markus Ulmer, Peter Bühlmann (ETH Zürich)
  • Classification: stat.ME (Statistics - Methodology)
  • Publication Date: January 3, 2025 (arXiv version)
  • Paper Link: https://arxiv.org/abs/2403.03778

Abstract

This paper proposes a novel method for causal discovery in linear structural vector autoregressive (SVAR) models. The authors extend the ancestor regression method, originally designed for independent observations, to time series settings while maintaining its advantageous property of explicit error control for spurious causal discoveries (at least asymptotically). The method is applied to multiple real-world bivariate time series datasets, with results largely consistent with domain knowledge. The temporal ordering can be interpreted as background knowledge about potential causal mechanisms, so the ideas could be extended to incorporate other forms of background knowledge, even for independent observations.

Research Background and Motivation

  1. Problem to Address: Real-world datasets typically exhibit temporal structure, violating the independent and identically distributed assumption widely used in causal discovery. This paper aims to address causal discovery in structural vector autoregressive (SVAR) models.
  2. Problem Importance: Time series data are ubiquitous in practical applications, yet traditional causal discovery methods are primarily designed for independent observations. While temporal dependence introduces estimation challenges, it also provides an advantage: a variable cannot causally influence other variables at earlier time points.
  3. Limitations of Existing Methods:
    • Traditional methods such as LiNGAM are primarily designed for independent observations
    • Lack of explicit error control for causal discovery in time series
    • Existing SVAR extensions lack theoretical guarantees
  4. Research Motivation: Extend the ancestor regression method of Schultheiss and Bühlmann (2023) to multivariate time series, maintaining asymptotic guarantees while handling temporal dependence.

Core Contributions

  1. Method Extension: Extend ancestor regression from independent observations to linear SVAR models, handling both instantaneous and lagged causal relationships
  2. Error Control: Provide asymptotic Type I error guarantees, achieving explicit control over spurious causal discoveries
  3. Adjustment Set Selection: Demonstrate how to select appropriate adjustment sets for different time lags to achieve error control
  4. Network Inference: Propose algorithms for constructing instantaneous effect graphs and summary time graphs
  5. Empirical Validation: Verify method effectiveness on real-world datasets

Methodology Details

Task Definition

Given multivariate time series $x_{t,j}$ (t = 1,...,T; j = 1,...,d), the goal is to identify causal ancestor relationships between variables, including instantaneous effects (τ=0) and lagged effects (τ>0).

Model Architecture

SVAR Model: $x_t = \sum_{\tau=0}^p B_\tau x_{t-\tau} + \epsilon_t$

Where:

  • $B_0$ corresponds to instantaneous effects, assumed to have an acyclic structure
  • $B_\tau$ (τ>0) are lagged effect matrices
  • $\epsilon_t$ are independent innovations

Equivalent Form: $x_t = \sum_{\tau=1}^p \tilde{B}_\tau x_{t-\tau} + \xi_t$
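
The step from the SVAR model to the equivalent form can be written out by solving for the current time point. This is a short derivation, assuming that $I - B_0$ is invertible (which holds when $B_0$ is acyclic, since it can then be permuted to a strictly triangular matrix):

$$
\begin{aligned}
x_t &= B_0 x_t + \sum_{\tau=1}^{p} B_\tau x_{t-\tau} + \epsilon_t \\
(I - B_0)\, x_t &= \sum_{\tau=1}^{p} B_\tau x_{t-\tau} + \epsilon_t \\
x_t &= \sum_{\tau=1}^{p} \underbrace{(I - B_0)^{-1} B_\tau}_{\tilde{B}_\tau}\, x_{t-\tau} + \underbrace{(I - B_0)^{-1} \epsilon_t}_{\xi_t}
\end{aligned}
$$

Under this reading, $\xi_t$ is a linear mixture of the components of $\epsilon_t$, so instantaneous effects are still encoded in $\xi_t$ after the lagged part has been removed.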

Core Algorithm

Ancestor Regression Core Idea: For a nonlinear function $f(\cdot)$, use least squares regression of $f(\xi^{\tau}_{t,j})$ versus $\xi_{t-\tau}$

Where $\xi^{\tau}_{t,j}$ and $\xi_{t-\tau}$ are residuals after projecting out contributions from earlier time points.

Key Theorem 1: For $k \notin \mathrm{AN}_\tau(j)$ (k is not a τ-lagged ancestor of j): $\beta^{f,j,\tau}_k = E[z_{t-\tau,k} f(\xi^{\tau}_{t,j})] / E[z^2_{t-\tau,k}] = 0$
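
To make the regression idea concrete, here is a minimal Python sketch for the instantaneous case (τ=0) in a bivariate SVAR(1). It is an illustration under stated assumptions, not the authors' implementation: the function names (`simulate_svar1`, `ols_z_stats`, `instantaneous_ancestor_z`) are hypothetical, the innovations are uniform with unit variance, the residuals ξ are estimated by a plain lag-1 least squares fit without intercept, $f(u) = u^3$ is used (which coincides with sign(u)|u|³), and ordinary OLS standard errors are used as a simplification. The lag-specific adjustment sets for τ>0 are not covered.

```python
import numpy as np

def simulate_svar1(T, B0, B1, rng, burn_in=200):
    """Simulate x_t = B0 x_t + B1 x_{t-1} + eps_t with uniform, unit-variance innovations."""
    d = B0.shape[0]
    A = np.linalg.inv(np.eye(d) - B0)  # (I - B0)^{-1}, exists for acyclic B0
    eps = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(T + burn_in, d))
    x = np.zeros((T + burn_in, d))
    for t in range(1, T + burn_in):
        x[t] = A @ (B1 @ x[t - 1] + eps[t])
    return x[burn_in:]

def ols_z_stats(y, X):
    """z-statistics of a least squares fit of y on X (no intercept, plain OLS standard errors)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)
    return beta / np.sqrt(sigma2 * np.diag(XtX_inv))

def instantaneous_ancestor_z(x, j, f=lambda u: u ** 3):
    """Regress f(xi_{t,j}) on xi_t, where xi_t are residuals of a lag-1 least squares fit."""
    past, present = x[:-1], x[1:]
    coef, *_ = np.linalg.lstsq(past, present, rcond=None)
    xi = present - past @ coef  # estimated xi_t: contribution of x_{t-1} projected out
    return ols_z_stats(f(xi[:, j]), xi)

# Illustrative setup (assumed values): variable 0 has an instantaneous effect on variable 1.
rng = np.random.default_rng(0)
B0 = np.array([[0.0, 0.0], [2.0, 0.0]])
B1 = 0.5 * np.eye(2)
x = simulate_svar1(5000, B0, B1, rng)

z_target_1 = instantaneous_ancestor_z(x, j=1)  # entry 0 tests "0 is an ancestor of 1"
z_target_0 = instantaneous_ancestor_z(x, j=0)  # entry 1 tests "1 is an ancestor of 0"
print(z_target_1, z_target_0)
```

In this setup the z-statistic for variable 0 in the regression targeting variable 1 should be large in absolute value, while the one for variable 1 in the regression targeting variable 0 should behave like a standard normal draw. Note that the signal vanishes for Gaussian innovations, and in this particular two-variable example also when the edge weight equals 1 with identically distributed innovations, which is why a weight of 2 is used.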

Technical Innovations

  1. Residual Construction: Remove influences from earlier time points through projection, improving signal-to-noise ratio
  2. Lag Adjustment: Construct appropriate adjustment sets for different lags τ
  3. Asymptotic Theory: Establish asymptotic normality based on near-epoch dependence
  4. Network Inference: Recursively construct ancestor relationships, handling cycle detection
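
The recursive construction in item 4 can be illustrated with a transitive closure over the individually detected ancestor relations. The sketch below assumes a boolean matrix `detected` in which `detected[j, k]` means "k was detected as an ancestor of j"; the function name `ancestor_closure` is hypothetical, and how a detected cycle is resolved is left to the caller.

```python
import numpy as np

def ancestor_closure(detected):
    """Transitive closure of detected ancestor relations (Warshall's algorithm).

    detected[j, k] == True means k is a detected ancestor of j.
    Returns the closed matrix and a flag that is True if some variable
    ends up as its own ancestor, i.e. the detections imply a cycle.
    """
    closure = np.array(detected, dtype=bool)  # copy, input is left untouched
    d = closure.shape[0]
    for m in range(d):
        # if m is an ancestor of j and k is an ancestor of m, then k is an ancestor of j
        closure |= closure[:, [m]] & closure[[m], :]
    has_cycle = bool(np.diag(closure).any())
    return closure, has_cycle

# Chain 0 -> 1 -> 2: only the two direct relations are detected individually.
detected = np.zeros((3, 3), dtype=bool)
detected[1, 0] = True
detected[2, 1] = True
closure, has_cycle = ancestor_closure(detected)
print(closure[2, 0], has_cycle)
```

Here the relation "0 is an ancestor of 2" is recovered from the two detected relations even though it was not found on its own, which is the sense in which recursion can help with effects that are hard to detect individually.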

Experimental Setup

Datasets

Simulated Data:

  • Number of variables: d = 6, 10, 50
  • SVAR order: p = 1
  • Sample sizes: 10² to 10⁶
  • Error distributions: t₇, uniform, Laplace, mixture of normal distributions
  • Edge weights: uniform distribution, controlling signal-to-noise ratio

Real Data:

  1. Old Faithful Geyser: Waiting time vs. eruption duration (299 observations)
  2. Gas Furnace: Input gas rate vs. output CO₂ concentration (296 observations)
  3. Dairy Prices: Butter vs. cheddar cheese prices (522 observations)

Evaluation Metrics

  • Family-wise error rate (FWER): Probability of making at least one spurious discovery
  • Power: Detection rate of true causal relationships
  • p-values: Hypothesis tests based on asymptotic normal distribution
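
As a rough illustration of how the first two metrics could be estimated from repeated simulation runs, the following sketch assumes a boolean array `rejections` of shape (runs, hypotheses) and a boolean vector `is_ancestor` marking which hypotheses correspond to true ancestors. The function name and the exact definition of power (fraction of true ancestors detected, averaged over runs) are assumptions, not taken from the paper.

```python
import numpy as np

def fwer_and_power(rejections, is_ancestor):
    """Empirical FWER and power from simulation runs.

    rejections: (n_runs, n_hypotheses) booleans, True where a test rejected.
    is_ancestor: (n_hypotheses,) booleans, True where the relation truly exists.
    Power is nan if there are no true ancestors.
    """
    rejections = np.asarray(rejections, dtype=bool)
    is_ancestor = np.asarray(is_ancestor, dtype=bool)
    # a run counts toward the FWER if it contains at least one false rejection
    fwer = rejections[:, ~is_ancestor].any(axis=1).mean()
    # power: share of true ancestor relations that were rejected, over all runs
    power = rejections[:, is_ancestor].mean()
    return fwer, power

is_ancestor = np.array([True, True, False, False])
rejections = np.array([
    [True, False, False, False],
    [True, True, True, False],   # one false rejection in this run
    [False, True, False, False],
    [True, True, False, False],
])
fwer, power = fwer_and_power(rejections, is_ancestor)
print(fwer, power)
```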

Comparison Methods

  • LiNGAM algorithm (Hyvärinen et al., 2010)
  • Performance comparison under different sample sizes and latent variable settings

Implementation Details

  • Nonlinear function: f(x) = sign(x)|x|³
  • Multiple testing correction: Bonferroni-Holm method
  • Significance level: α = 0.05
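
The testing pipeline implied by these settings (two-sided p-values from asymptotically normal z-statistics, followed by a Bonferroni-Holm correction) can be sketched as follows. The helper names `two_sided_pvalues` and `holm_adjust` are hypothetical; this is a generic implementation of the standard procedure, not code from the paper.

```python
import math
import numpy as np

def two_sided_pvalues(z):
    """Two-sided p-values under a standard normal null: p = erfc(|z| / sqrt(2))."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])

def holm_adjust(pvals):
    """Bonferroni-Holm adjusted p-values, returned in the original order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # the i-th smallest p-value (i = 0, 1, ...) is multiplied by (m - i)
    scaled = (m - np.arange(m)) * p[order]
    # enforce monotonicity and cap at 1
    adj_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adj = np.empty(m)
    adj[order] = adj_sorted
    return adj

alpha = 0.05
z_stats = np.array([4.1, 0.3, -2.5])  # illustrative values only
p_adj = holm_adjust(two_sided_pvalues(z_stats))
rejected = p_adj < alpha
print(p_adj, rejected)
```

Rejecting whenever the adjusted p-value falls below α controls the family-wise error rate at level α, provided the individual p-values are valid.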

Experimental Results

Main Results

Simulation Experiments:

  • For non-ancestor variables, the average absolute z-statistic approaches the mean of the theoretical null distribution
  • Type I error is controlled across all sample sizes
  • Detection power increases with sample size
  • Lagged ancestors are easier to detect than instantaneous ancestors (stronger signal)
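
For reference, if the null distribution of a z-statistic is taken to be standard normal (an assumption consistent with the asymptotic normality stated above), the expected absolute value that the non-ancestor averages should approach is:

$$
E|Z| = 2\int_0^\infty z\,\phi(z)\,dz = \frac{2}{\sqrt{2\pi}} = \sqrt{\frac{2}{\pi}} \approx 0.798, \qquad Z \sim \mathcal{N}(0, 1)
$$

Averages clearly above this value for a group of variables indicate signal, while averages near it are what the null predicts.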

Network Inference:

  • Both instantaneous effect graphs and summary time graphs achieve good ancestor-non-ancestor separation
  • Recursive construction helps detect effects difficult to find individually
  • Nearly perfect performance at large sample sizes

Ablation Studies

Latent Variable Effects:

  • Loss of error control at nominal level when assumptions are violated
  • Still maintains effect size separation between ancestors and non-ancestors
  • p-value ordering still indicates true ancestors

Different Ancestor Types:

  • Direct lagged effects ($\tilde{B}_{4,k} \neq 0$): Strongest signal
  • Instantaneous ancestors: Moderate signal
  • Lagged ancestors mediated through instantaneous effects: Weakest signal

Case Studies

Old Faithful Geyser:

  • Original data: No significant instantaneous effects detected
  • After temporal adjustment: Detected instantaneous effect from eruption duration → waiting time (p=5×10⁻⁴)
  • Consistent with domain knowledge

Gas Furnace:

  • No instantaneous effects
  • Detected lagged effect from input gas rate → output CO₂ concentration (p=4×10⁻²⁰)

Dairy Prices:

  • Detected lagged effect from butter → cheddar cheese prices (p=5×10⁻¹⁵)
  • No reverse effect found, which argues against a hidden-confounding explanation

Experimental Findings

  1. Method performs well at finite sample sizes
  2. Prior knowledge from temporal structure aids causal inference
  3. Recursive construction significantly improves network inference performance
  4. Exhibits certain robustness to model assumption violations

Related Work

Main Research Directions

  1. LiNGAM Series: Shimizu et al. (2006) linear non-Gaussian acyclic models and time series extensions
  2. Structural Causal Models: Peters et al. (2013) restricted structural equation models
  3. Ancestor Regression: Schultheiss & Bühlmann (2023) method for independent observations

Relationship to This Paper

  • Extends ancestor regression to time series settings
  • Similar identification capability to LiNGAM SVAR extensions, but with error control
  • Higher computational efficiency compared to traditional methods

Comparative Advantages

  • vs LiNGAM: Provides interpretable error control, but slightly lower power
  • vs Traditional Methods: Leverages temporal structure, avoids certain identification issues
  • vs Other SVAR Methods: Stronger theoretical guarantees, simpler implementation

Conclusions and Discussion

Main Conclusions

  1. Successfully extend ancestor regression to SVAR models
  2. Maintain the desirable property of asymptotic Type I error control
  3. Verify method effectiveness on simulated and real data
  4. Provide new theoretical framework for time series causal discovery

Limitations

  1. Model Assumptions: Requires linear relationships and independent innovations
  2. Instantaneous Acyclicity: Assumes instantaneous effects are acyclic, potentially unrealistic
  3. Gaussian Noise: Sensitive to Gaussian noise in adjacent variables
  4. Latent Variables: Loses error control in presence of unobserved variables

Future Directions

  1. Background Knowledge Integration: Extend to more general background knowledge settings
  2. Nonlinear Extensions: Handle nonlinear causal relationships
  3. High-Dimensional Optimization: Improve computational efficiency for high-dimensional time series
  4. Robustness Enhancement: Develop robust methods against model assumption violations

In-Depth Evaluation

Strengths

  1. Theoretical Rigor: Complete asymptotic theory analysis and proofs
  2. Methodological Innovation: Clever exploitation of temporal structure for causal inference
  3. Strong Practicality: Simple computation, easy to implement
  4. Comprehensive Validation: Thorough simulated and real data verification
  5. Clear Writing: Logical flow, accurate mathematical exposition

Weaknesses

  1. Strict Assumptions: Linearity and independence assumptions limit applicability
  2. Power Issues: Lower power than LiNGAM in some scenarios
  3. Limited Real Data: Validation only on bivariate time series
  4. High-Dimensional Challenges: Overly conservative multiple testing correction for large-scale networks

Impact

  1. Theoretical Contribution: Provides new theoretical framework for time series causal discovery
  2. Methodological Value: Important extension of ancestor regression
  3. Practical Value: Provides tools for real time series analysis
  4. Reproducibility: Code publicly available, results reproducible

Applicable Scenarios

  1. Economic Time Series: Causal relationship analysis of macroeconomic variables
  2. Biomedical: Causal inference among physiological signals
  3. Engineering Systems: Causal relationship identification in control systems
  4. Social Sciences: Dynamic causal analysis of social phenomena

References

  1. Schultheiss, C. and Bühlmann, P. (2023). Ancestor regression in linear structural equation models. Biometrika, 110(4):1117–1124.
  2. Shimizu, S., Hoyer, P. O., Hyvärinen, A., and Kerminen, A. (2006). A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7(10).
  3. Peters, J., Janzing, D., and Schölkopf, B. (2013). Causal inference on time series using restricted structural equation models. Advances in Neural Information Processing Systems, 26.
  4. Hyvärinen, A., Zhang, K., Shimizu, S., and Hoyer, P. O. (2010). Estimation of a structural vector autoregression model using non-Gaussianity. Journal of Machine Learning Research, 11(5).

Overall Assessment: This is a high-quality methodological paper with significant contributions at both theoretical and practical levels. The authors successfully extend an important causal discovery method to time series settings while maintaining the original method's desirable properties. Despite some limitations, it provides valuable tools and theoretical foundations for the time series causal inference field.