Robust Causal Discovery in Real-World Time Series with Power-Laws

Tusoni, Masi, Coletta et al.
Exploring causal relationships in stochastic time series is a challenging yet crucial task with a vast range of applications, including finance, economics, neuroscience, and climate science. Many algorithms for Causal Discovery (CD) have been proposed, but they often exhibit a high sensitivity to noise, resulting in misleading causal inferences when applied to real data. In this paper, we observe that the frequency spectra of typical real-world time series follow a power-law distribution, notably due to an inherent self-organizing behavior. Leveraging this insight, we build a robust CD method based on the extraction of power-law spectral features that amplify genuine causal signals. Our method consistently outperforms state-of-the-art alternatives on both synthetic benchmarks and real-world datasets with known causal structures, demonstrating its robustness and practical relevance.

Basic Information

  • Paper ID: 2507.12257
  • Title: Robust Causal Discovery in Real-World Time Series with Power-Laws
  • Authors: Matteo Tusoni, Giuseppe Masi, Andrea Coletta, Aldo Glielmo, Viviana Arrigoni, Novella Bartolini
  • Classification: cs.LG physics.data-an stat.ML stat.OT
  • Publication Date: October 12, 2025 (arXiv v2)
  • Paper Link: https://arxiv.org/abs/2507.12257

Abstract

Exploring causal relationships in stochastic time series is a challenging yet crucial task with broad applications in finance, economics, neuroscience, and climate science. Although numerous causal discovery (CD) algorithms have been proposed, they are often highly sensitive to noise and prone to producing misleading causal inferences when applied to real-world data. This paper observes that the frequency spectra of typical real-world time series follow power-law distributions, primarily due to inherent self-organized behavior. Based on this insight, we construct a robust causal discovery method based on power-law spectral feature extraction that amplifies genuine causal signals. Our method consistently outperforms state-of-the-art alternatives on synthetic benchmarks and real-world datasets with known causal structures, demonstrating its robustness and practical relevance.

Research Background and Motivation

Problem Definition

This research addresses the causal discovery problem in time series data, namely identifying causal relationships between variables from observational data. Traditional causal discovery methods, particularly those based on Granger causality, exhibit the following limitations when confronted with complex real-world data:

  1. Noise Sensitivity: Traditional methods are highly sensitive to non-Gaussian noise, non-stationarity, and nonlinear perturbations
  2. Assumption Constraints: Reliance on strict assumptions such as noise stationarity and single characteristic scales
  3. Spurious Relationship Detection: Tendency to misidentify noise correlations as causal relationships

Research Motivation

The authors observe that typical real-world systems exhibit power-law spectral characteristics, stemming from:

  • Self-organized behavior of multiple interacting units
  • Scale invariance resulting from the absence of external coordinators
  • Fractal properties and long-range temporal correlations of systems

Based on this observation, the paper proposes leveraging power-law spectral features for more robust causal discovery.

Core Contributions

  1. Proposes PLaCy Framework: A novel causal discovery method based on power-law spectral features
  2. Theoretical Guarantees: Proves the invariance of causal graph structure under frequency domain transformation (Theorem 1)
  3. Experimental Validation: Comprehensive evaluation on synthetic and real datasets, demonstrating superior robustness
  4. Method Generality: Demonstrates the improvement of spectral preprocessing on other causal discovery algorithms

Methodology Details

Task Definition

Given a multivariate time series $x \in \mathbb{R}^{L \times d}$, the objective is to infer a directed graph $G = (V, E)$, where:

  • $V = \{1, 2, \ldots, d\}$ represents system variables
  • $E \subseteq V \times V$ represents the set of causal edges
  • A directed edge $(i, j)$ exists if and only if $x_i$ is a cause of $x_j$

Model Architecture

1. Sliding Window Segmentation

Each time series is segmented into overlapping windows of length $l$ with stride $s$: $w_i^k = (x_i(k \cdot s), \ldots, x_i(k \cdot s + l - 1))$
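The segmentation can be illustrated with a short NumPy sketch. The function name `sliding_windows` is hypothetical, and the window count follows the ⌊(L-l)/s⌋+1 expression used in the algorithm listing of this summary.

```python
import numpy as np

def sliding_windows(x, l, s):
    """Return an array of shape (n_windows, l) whose k-th row is x[k*s : k*s + l]."""
    x = np.asarray(x, dtype=float)
    if l > x.size:
        raise ValueError("window length l must not exceed the series length")
    n_windows = (x.size - l) // s + 1
    # Row k starts at offset k*s; only windows that fit entirely are kept.
    return np.stack([x[k * s : k * s + l] for k in range(n_windows)])

x = np.arange(10.0)
windows = sliding_windows(x, l=4, s=2)
print(windows.shape)   # (4, 4)
print(windows[1])      # [2. 3. 4. 5.]
```

With stride 1, consecutive windows share all but one sample, so the resulting feature series is long but strongly autocorrelated.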

2. Spectral Feature Extraction

The discrete Fourier transform (DFT) is applied to each window. For a signal $x$ of length $L$ (for a single window, $L$ is the window length $l$): $\phi(k) = \sum_{t=0}^{L-1} x(t) e^{-i 2\pi k t / L}$

The spectral magnitude is computed as: $A(f_k) = |\phi(k)|$
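A minimal sketch of this step using NumPy's real-input FFT is given below. Keeping only the positive-frequency bins and dropping the zero-frequency bin are assumptions made here (the later log-log fit needs $f > 0$); the function name `spectral_magnitude` is hypothetical.

```python
import numpy as np

def spectral_magnitude(window):
    """Return (frequencies, magnitudes) for the positive-frequency DFT bins."""
    window = np.asarray(window, dtype=float)
    n = window.size
    phi = np.fft.rfft(window)            # DFT coefficients for k = 0 .. n//2
    freqs = np.fft.rfftfreq(n, d=1.0)    # f_k = k / n, in cycles per sample
    # Assumption: drop k = 0 (the window mean), since log f is undefined at f = 0.
    return freqs[1:], np.abs(phi[1:])

t = np.arange(64)
window = np.sin(2 * np.pi * 8 * t / 64)   # exactly 8 cycles in 64 samples
freqs, amp = spectral_magnitude(window)
print(freqs[np.argmax(amp)])              # 0.125
```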

3. Power-Law Fitting

A linear model is fitted in log-log space: $\log A(f) = a - \lambda \log f$

where $a$ is the intercept parameter and $\lambda > 0$ is the spectral exponent.
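The fit amounts to an ordinary least-squares line through the points $(\log f, \log A)$. The sketch below uses `np.polyfit` for this; the small `eps` guard against $\log 0$ and the function name `fit_power_law` are assumptions introduced for illustration, not details from the paper.

```python
import numpy as np

def fit_power_law(freqs, amp, eps=1e-12):
    """Least-squares fit of log A(f) = a - lam * log f; returns (a, lam)."""
    log_f = np.log(np.asarray(freqs, dtype=float))
    log_a = np.log(np.asarray(amp, dtype=float) + eps)  # eps guards against log(0)
    slope, intercept = np.polyfit(log_f, log_a, deg=1)
    return intercept, -slope   # the fitted slope equals -lam

freqs = np.arange(1, 26) / 50.0
amp = 3.0 * freqs ** -1.5          # exact power law with a = log 3, lam = 1.5
a, lam = fit_power_law(freqs, amp)
print(round(a, 3), round(lam, 3))  # 1.099 1.5
```

Each window is thereby compressed to two numbers, which is what makes the later causal test operate on a smoothed summary of the signal rather than on the raw samples.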

4. Causal Analysis

Multivariate Granger causality testing is applied to the extracted spectral parameter time series $(a_i, \lambda_i)$, evaluating the predictive power of $(\lambda_i, a_i)$ on $\lambda_j$.
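For reference, a pairwise version of this test can be written in the standard textbook form of a Granger F-test. The lag order $p$, the coefficient names, the residual sums of squares $\mathrm{RSS}_r$ and $\mathrm{RSS}_u$, and the number of usable window indices $K$ are notation introduced here for illustration and are not taken from the paper. A fully multivariate version would also include lags of the remaining variables in both models.

```latex
\begin{align}
\text{restricted:}\quad \lambda_j(k) &= c + \sum_{m=1}^{p} \alpha_m \lambda_j(k-m) + e_r(k) \\
\text{unrestricted:}\quad \lambda_j(k) &= c + \sum_{m=1}^{p} \alpha_m \lambda_j(k-m)
  + \sum_{m=1}^{p} \bigl( \beta_m \lambda_i(k-m) + \gamma_m a_i(k-m) \bigr) + e_u(k) \\
F &= \frac{(\mathrm{RSS}_r - \mathrm{RSS}_u) / (2p)}{\mathrm{RSS}_u / (K - 3p - 1)}
\end{align}
```

Under the null hypothesis that all $\beta_m$ and $\gamma_m$ vanish, $F$ follows an $F(2p,\, K - 3p - 1)$ distribution, and an edge from $i$ to $j$ is declared when the resulting p-value falls below the chosen significance threshold.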

Algorithm Flow (PLaCy)

Input: Time series x = (x₁, ..., x_d), window size l, stride s
Output: Causal graph G

1. Segment each xᵢ into ⌊(L-l)/s⌋+1 sliding windows wᵢᵏ
2. for each i ∈ {1, ..., d} do
3.   for each k ∈ {0, ..., ⌊(L-l)/s⌋} do
4.     Apply DFT to wᵢᵏ to obtain φᵢᵏ
5.     Obtain (aᵢᵏ, λᵢᵏ) by fitting the log-log linear model log A(f) = a − λ log f
6.   Concatenate (aᵢᵏ, λᵢᵏ) to form time series (aᵢ, λᵢ)
7. for each i,j ∈ {1, ..., d}, i ≠ j do
8.   Gᵢ,ⱼ ← Granger causality test with (aᵢ,λᵢ) as cause and λⱼ as effect
9. return G
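The listing above can be sketched end to end in Python. This is an illustrative reading of the pseudocode, not the authors' implementation: the function names are hypothetical, only positive-frequency bins are fitted, the Granger step is a pairwise F-test without conditioning on the remaining variables, and the defaults mirror the values listed under Implementation Details. It assumes NumPy and SciPy are available.

```python
import numpy as np
from scipy import stats

def spectral_params(x, l, s):
    """Per-window (a, lam) from a log-log fit of |DFT| against frequency."""
    n_win = (len(x) - l) // s + 1
    log_f = np.log(np.fft.rfftfreq(l)[1:])            # skip f = 0
    out = np.empty((n_win, 2))
    for k in range(n_win):
        amp = np.abs(np.fft.rfft(x[k * s : k * s + l]))[1:]
        slope, intercept = np.polyfit(log_f, np.log(amp + 1e-12), 1)
        out[k] = intercept, -slope
    return out

def lagged(z, p):
    """Design columns z(k-1), ..., z(k-p) for rows k = p .. K-1."""
    z = np.asarray(z, dtype=float).reshape(len(z), -1)
    return np.hstack([z[p - m : len(z) - m] for m in range(1, p + 1)])

def granger_pvalue(cause, effect, p):
    """F-test: do lags of `cause` improve an AR(p) prediction of `effect`?"""
    y = effect[p:]
    ones = np.ones((len(y), 1))
    X_r = np.hstack([ones, lagged(effect, p)])         # restricted model
    X_u = np.hstack([X_r, lagged(cause, p)])           # plus lags of the cause
    rss = lambda X: np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)
    df1, df2 = X_u.shape[1] - X_r.shape[1], len(y) - X_u.shape[1]
    f_stat = ((rss(X_r) - rss(X_u)) / df1) / (rss(X_u) / df2)
    return stats.f.sf(f_stat, df1, df2)

def placy_sketch(x, l=50, s=1, p=10, alpha=0.05):
    """x has shape (L, d); returns a d x d boolean matrix with G[i, j] = (i -> j)."""
    d = x.shape[1]
    feats = [spectral_params(x[:, i], l, s) for i in range(d)]
    G = np.zeros((d, d), dtype=bool)
    for i in range(d):
        for j in range(d):
            if i != j:
                # Cause: (a_i, lam_i); effect: lam_j only, as in step 8.
                G[i, j] = granger_pvalue(feats[i], feats[j][:, 1], p) < alpha
    return G

rng = np.random.default_rng(0)
x = rng.standard_normal((400, 3))   # placeholder data, no causal structure intended
print(placy_sketch(x).astype(int))
```

Because windows overlap heavily at stride 1, the feature series are autocorrelated, so the p-values from this plain F-test should be read as a rough screening signal rather than as calibrated error rates.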

Technical Innovations

  1. Frequency Domain Causal Discovery: First systematic utilization of power-law spectral features for causal inference
  2. Adaptive Window Selection: Automatic selection of optimal window length through p-value criterion
  3. Noise Robustness: Spectral fitting serves as a natural denoising step, improving robustness to non-Gaussian fluctuations
  4. Theoretical Foundation: Provides theoretical proof of causal graph invariance under spectral transformation

Experimental Setup

Datasets

Synthetic Datasets

Generated from a generalized Ornstein-Uhlenbeck process across four scenarios: $x(t+\Delta t) = x(t) + \frac{\Delta t}{\tau_c}(\mu - x(t)) + \left(\sigma_b \epsilon_b(t) + \sigma_g^a \epsilon_g^a(t) + \sigma_g^m \epsilon_g^m(t) \cdot x(t)\right)\sqrt{\Delta t}$

  • OU($\sigma_g^m = 0$): Equilibrium without multiplicative noise
  • OU($\sigma_g^m > 0$): Equilibrium with multiplicative noise
  • ÔU($\sigma_g^m = 0$): Non-equilibrium without multiplicative noise
  • ÔU($\sigma_g^m > 0$): Non-equilibrium with multiplicative noise
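The update rule can be iterated directly, as in the single-variable sketch below. Several things here are assumptions: this summary does not state the distributions of $\epsilon_b$, $\epsilon_g^a$, and $\epsilon_g^m$, so all three are drawn from standard normals as placeholders; the parameter defaults are arbitrary; and the coupling between variables that creates causal links is not modelled. The `x0` argument is a hypothetical knob, since starting far from $\mu$ is only one possible way to mimic a non-equilibrium start.

```python
import numpy as np

def simulate_ou(n_steps, dt=0.01, tau_c=1.0, mu=0.0, sigma_b=0.1,
                sigma_ga=1.0, sigma_gm=0.0, x0=0.0, seed=0):
    """Iterate the update rule for one variable; returns an array of length n_steps."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(n_steps - 1):
        # Assumption: all three noise terms are standard normal draws.
        eps_b, eps_ga, eps_gm = rng.standard_normal(3)
        drift = dt / tau_c * (mu - x[t])
        noise = sigma_b * eps_b + sigma_ga * eps_ga + sigma_gm * eps_gm * x[t]
        x[t + 1] = x[t] + drift + noise * np.sqrt(dt)
    return x

x_add = simulate_ou(1000)                   # additive noise only
x_mult = simulate_ou(1000, sigma_gm=0.5)    # with multiplicative noise
```

Setting `sigma_gm` to zero removes the state-dependent term, which corresponds to the two scenarios without multiplicative noise.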

Real-World Datasets

  1. Rivers Dataset: River water level and precipitation data from three hydrological stations in southern Germany
  2. AirQuality Dataset: PM2.5 pollution monitoring data from multiple Chinese cities

Evaluation Metrics

  • F1 Score: Measures overall performance of causal relationship identification
  • True Negative Rate (TNR): Evaluates the algorithm's ability to exclude spurious associations
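Both metrics can be computed by comparing a predicted adjacency matrix against the ground truth, as in the sketch below. Excluding self-loops from the counts is an assumption made here (the algorithm only tests pairs with i ≠ j), and the function name `f1_and_tnr` is hypothetical.

```python
import numpy as np

def f1_and_tnr(G_true, G_pred):
    """F1 and true negative rate over the off-diagonal entries of two adjacency matrices."""
    G_true = np.asarray(G_true, dtype=bool)
    G_pred = np.asarray(G_pred, dtype=bool)
    off = ~np.eye(G_true.shape[0], dtype=bool)   # assumption: ignore self-loops
    t, p = G_true[off], G_pred[off]
    tp = np.sum(t & p)
    fp = np.sum(~t & p)
    fn = np.sum(t & ~p)
    tn = np.sum(~t & ~p)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return float(f1), float(tnr)

G_true = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
G_pred = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]
print(f1_and_tnr(G_true, G_pred))   # (0.5, 0.75)
```

In this toy case one of two true edges is recovered and one of four absent edges is wrongly reported, which gives an F1 of 0.5 and a TNR of 0.75.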

Comparison Methods

  • Traditional Methods: Granger Causality, PCMCI, PCMCIΩ
  • Optimization Methods: DYNOTEARS, RCV-VarLiNGAM
  • Deep Learning: Rhino
  • Nonlinear Methods: CCM-Filtering
  • Frequency Domain Methods: BCGeweke, DTF, GewekeNP

Implementation Details

  • Sliding window length: $l = 50$ (selected through p-value criterion)
  • Stride: $s = 1$
  • Lag terms: 10
  • Statistical significance threshold: $p = 0.05$

Experimental Results

Main Results

Performance on synthetic datasets ($N = 5$, $\sigma_g^a = 1.0$):

| Dataset | PLaCy F1 | Best Baseline F1 | PLaCy TNR | Best Baseline TNR |
|---|---|---|---|---|
| OU($\sigma_g^m = 0$) | 0.77±0.17 | 0.61±0.18 | 0.94±0.05 | 0.99±0.02 |
| OU($\sigma_g^m > 0$) | 0.80±0.17 | 0.79±0.11 | 0.94±0.06 | 0.98±0.03 |
| ÔU($\sigma_g^m = 0$) | 0.70±0.17 | 0.58±0.18 | 0.88±0.09 | 0.99±0.02 |
| ÔU($\sigma_g^m > 0$) | 0.80±0.17 | 0.71±0.13 | 0.93±0.07 | 0.98±0.03 |

Real-world dataset results:

| Dataset | PLaCy F1 | PLaCy TNR | Best Baseline F1 | Best Baseline TNR |
|---|---|---|---|---|
| Rivers | 0.51±0.10 | 0.75±0.13 | 0.47±0.07 | 0.74±0.05 |
| AirQuality | 0.45±0.04 | 0.66±0.07 | 0.44±0.01 | 0.95±0.02 |

Key Findings

  1. Multiplicative Noise Robustness: PLaCy performs particularly well in scenarios with multiplicative noise
  2. Non-Equilibrium Adaptability: Maintains good performance even under non-equilibrium initialization conditions
  3. Frequency Domain Advantages: Frequency domain analysis demonstrates superior noise resistance compared to time domain methods
  4. Universal Improvement: Applying spectral preprocessing to methods like PCMCI significantly enhances performance

Ablation Studies

Analysis of window length and stride reveals:

  • Stride of 1 yields optimal performance, capturing short-range causal dependencies
  • Adaptive window length selection through p-value criterion performs best
  • Both excessively short and long windows degrade performance

Related Work

Traditional Causal Discovery

  • Granger Causality: Classical method based on VAR models
  • Constraint Methods: PC algorithm and its temporal extension PCMCI
  • Optimization Methods: Continuous optimization approaches such as DYNOTEARS

Frequency Domain Causal Analysis

  • Geweke Decomposition: Pioneering work on frequency domain Granger causality
  • DTF Method: Directional analysis based on transfer functions
  • Nonparametric Methods: Direct causal estimation from empirical power spectra

Deep Learning Methods

  • Rhino: Neural network approach for handling history-dependent noise
  • Causal Representation Learning: Causal discovery combined with deep learning

Conclusions and Discussion

Main Conclusions

  1. PLaCy achieves more robust causal discovery by leveraging power-law spectral features
  2. The method demonstrates superior performance on both synthetic and real-world data
  3. Frequency domain analysis provides a new perspective for time series causal discovery

Limitations

  1. Systems with Slowly Varying Spectra: Limited effectiveness for systems whose spectral parameters change slowly
  2. Short Time Series: Requires sufficiently long sequences for stable spectral estimation
  3. Computational Complexity: Additional spectral analysis overhead compared to simple methods

Future Directions

  1. Extension to non-VAR causal discovery methods
  2. In-depth investigation of statistical parameters of spectral density
  3. Addressing the influence of latent confounders
  4. Development of more efficient online causal discovery algorithms

In-Depth Evaluation

Strengths

  1. Strong Innovation: First systematic application of power-law spectral features to causal discovery
  2. Solid Theory: Provides rigorous theoretical analysis and proofs
  3. Comprehensive Experiments: Covers multiple synthetic scenarios and real-world applications
  4. High Practical Value: Demonstrates significant advantages in noisy environments

Limitations

  1. Scope of Applicability: Primarily applicable to systems with power-law spectral characteristics
  2. Parameter Selection: Selection of parameters such as window length requires empirical judgment
  3. Computational Efficiency: Greater computational overhead compared to simple methods

Impact

  1. Academic Contribution: Provides a new research direction for time series causal discovery
  2. Practical Value: Broad application prospects in domains with power-law characteristics such as finance and climate
  3. Reproducibility: Provides complete algorithm description and open-source code

Applicable Scenarios

  • Financial market data analysis
  • Climate system modeling
  • Neuroscience research
  • Social network analysis
  • Any complex systems with self-organized characteristics

References

The paper cites 51 relevant references covering important works in causal discovery, time series analysis, complex systems, and other related fields, providing a solid theoretical foundation for the research.


Overall Assessment: This is a high-quality research paper that proposes an innovative method in the field of time series causal discovery. By cleverly leveraging the power-law spectral characteristics of real-world systems, it successfully enhances the robustness of causal discovery. The theoretical analysis is rigorous, the experimental design is sound, and the results are convincing. This work provides new tools and perspectives for causal inference in complex systems.