Using Information Geometry to Characterize Higher-Order Interactions in EEG

Albers, Marriott, Tatsuno
In neuroscience, methods from information geometry (IG) have been successfully applied in the modelling of binary vectors from spike train data, using the orthogonal decomposition of the Kullback-Leibler divergence and mutual information to isolate different orders of interaction between neurons. While spike train data is well-approximated with a binary model, here we apply these IG methods to data from electroencephalography (EEG), a continuous signal requiring appropriate discretization strategies. We developed and compared three different binarization methods and used them to identify third-order interactions in an experiment involving imagined motor movements. The statistical significance of these interactions was assessed using phase-randomized surrogate data that eliminated higher-order dependencies while preserving the spectral characteristics of the original signals. We validated our approach by implementing known second- and third-order dependencies in a forward model and quantified information attenuation at different steps of the analysis. This revealed that the greatest loss in information occurred when going from the idealized binary case to enforcing these dependencies using oscillatory signals. When applied to the real EEG dataset, our analysis detected statistically significant third-order interactions during the task condition despite the relatively sparse data (45 trials per condition). This work demonstrates that IG methods can successfully extract genuine higher-order dependencies from continuous neural recordings when paired with appropriate binarization schemes.

Basic Information

  • Paper ID: 2510.14188
  • Title: Using Information Geometry to Characterize Higher-Order Interactions in EEG
  • Authors: Eric Albers, Paul Marriott, Masami Tatsuno
  • Classification: q-bio.NC (Neurons and Cognition), q-bio.QM (Quantitative Methods)
  • Publication Date: October 16, 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2510.14188

Abstract

This study extends information geometry (IG) methods from binary spike train data to continuous electroencephalography (EEG) signals. Different orders of interaction are isolated through the orthogonal decomposition of the Kullback-Leibler divergence and mutual information. The authors develop and compare three binarization methods, use them to identify third-order interactions in a motor imagery experiment, and assess statistical significance with phase-randomized surrogate data. The approach is validated with a forward model containing known second- and third-order dependencies, which quantifies the information lost at each analysis step. Despite relatively sparse data (45 trials per condition), the method detects statistically significant third-order interactions during the task condition.

Research Background and Motivation

Problem Definition

Traditional neuroscience research primarily focuses on pairwise relationships (second-order interactions) between brain regions, yet the brain as a complex system may exhibit higher-order interactions beyond pairwise relationships. Existing functional connectivity networks constructed from pairwise correlations may fail to fully capture the complexity of brain information processing.

Significance

  1. Theoretical Importance: Understanding whether the brain requires third-order or higher-order interactions to accomplish cognitive functions
  2. Methodological Significance: Extending information geometry methods from discrete spike train data to continuous EEG signals
  3. Applied Value: Providing novel analytical tools for brain-computer interfaces and neurological disease diagnosis

Limitations of Existing Methods

  1. Information Geometry Approaches: Primarily applied to binary spike train data, lacking effective discretization strategies for continuous signals
  2. Traditional EEG Analysis: Primarily based on pairwise correlations, neglecting higher-order dependencies
  3. Statistical Inference: Under sparse data conditions, standard asymptotic tools (e.g., χ² distribution) may be inappropriate

Research Motivation

The aim is to extend the information geometry methods that have been applied successfully to spike train analysis to EEG data, by developing binarization strategies that capture genuine higher-order dependencies in continuous neural recordings.

Core Contributions

  1. Methodological Innovation: Development of three binarization methods (Sign, Diff, Power) to convert continuous EEG signals into binary representations suitable for information geometry analysis
  2. Validation Framework: Establishment of statistical significance testing methods based on phase-randomized surrogate data
  3. Forward Modeling: Implementation of forward models with known second-order and third-order dependencies, quantifying information loss throughout the analysis process
  4. Empirical Findings: Detection of statistically significant third-order interactions in motor imagery EEG data
  5. Theoretical Insights: Demonstration that the greatest information loss occurs when moving from the idealized binary case to enforcing the same dependencies with oscillatory signals

Methodology Details

Task Definition

  • Input: Multi-channel continuous EEG signals
  • Output: First-order, second-order, and third-order mutual information components among channel triplets
  • Constraints: Handling sparse data (45 trials/condition) and discretization challenges of continuous signals

Information Geometry Theoretical Foundation

For three binary variables X₁, X₂, X₃, the joint probability distribution can be expressed as a vector of eight probabilities:

p = (p₀₀₀, p₀₀₁, p₀₁₀, p₀₁₁, p₁₀₀, p₁₀₁, p₁₁₀, p₁₁₁)
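In practice the eight probabilities have to be estimated from binarized data. The following is a minimal sketch of a plain relative-frequency estimator, assuming three aligned 0/1 arrays (for example one value per trial for each channel of a triplet). The function name `estimate_joint` and the choice of treating X₁ as the leading bit are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def estimate_joint(x1, x2, x3):
    """Return (p000, p001, ..., p111) estimated from three aligned 0/1 arrays."""
    x1, x2, x3 = (np.asarray(v, dtype=int).ravel() for v in (x1, x2, x3))
    # Encode each observed pattern (x1, x2, x3) as an integer 0..7, x1 leading.
    codes = 4 * x1 + 2 * x2 + x3
    counts = np.bincount(codes, minlength=8)
    # Relative frequencies; patterns that never occur get probability 0.
    return counts / counts.sum()

# Four observations of a hypothetical channel triplet.
p_hat = estimate_joint([0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 1])
```

With few observations some patterns may never occur, which makes log-odds quantities undefined; how such zero counts are handled is a separate design choice that this sketch does not address.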

Expectation parameter (η) coordinate system:

  • η₁, η₂, η₃: marginal activation rates
  • η₁₂, η₁₃, η₂₃: pairwise activation rates
  • η₁₂₃: triplet activation rate
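As an illustration, the η coordinates are sums of the entries of p in which the relevant variables all equal 1. The sketch below assumes the ordering (p₀₀₀, p₀₀₁, ..., p₁₁₁) given above; the function name and the example distribution are made up for demonstration.

```python
from itertools import product

def eta_coordinates(p):
    """Compute the seven eta coordinates from p = (p000, p001, ..., p111)."""
    states = list(product((0, 1), repeat=3))  # (0,0,0), (0,0,1), ..., (1,1,1)
    prob = dict(zip(states, p))

    def rate(*units):
        # Probability that every listed unit (0-based index) is active.
        return sum(pr for state, pr in prob.items()
                   if all(state[u] == 1 for u in units))

    return {
        "eta1": rate(0), "eta2": rate(1), "eta3": rate(2),
        "eta12": rate(0, 1), "eta13": rate(0, 2), "eta23": rate(1, 2),
        "eta123": rate(0, 1, 2),
    }

# Hypothetical distribution, used only for illustration.
p_example = [0.20, 0.05, 0.10, 0.15, 0.05, 0.15, 0.10, 0.20]
eta = eta_coordinates(p_example)
```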

Natural parameter (θ) coordinate system, defined through log-odds ratios, for example:

θ₁₂₃ = log[(p₁₁₁ p₁₀₀ p₀₁₀ p₀₀₁) / (p₁₁₀ p₁₀₁ p₀₁₁ p₀₀₀)]
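A small sketch of this log-odds ratio, assuming the same ordering of p as above. It is zero for any distribution without a pure triplet interaction (an independent one, for instance) and nonzero otherwise. The function name and both example distributions are hypothetical.

```python
import math

def theta_123(p):
    """Third-order natural parameter for p = (p000, p001, ..., p111)."""
    p000, p001, p010, p011, p100, p101, p110, p111 = p
    if min(p) <= 0:
        raise ValueError("theta_123 is undefined when a pattern has zero probability")
    # Patterns with an odd number of ones go in the numerator, even in the denominator.
    return math.log((p111 * p100 * p010 * p001) / (p110 * p101 * p011 * p000))

# Independent variables with P(X1=1)=0.3, P(X2=1)=0.5, P(X3=1)=0.7.
independent = [a * b * c for a in (0.7, 0.3) for b in (0.5, 0.5) for c in (0.3, 0.7)]
# Hypothetical distribution with a triplet interaction.
interacting = [0.20, 0.05, 0.10, 0.15, 0.05, 0.15, 0.10, 0.20]

theta_indep = theta_123(independent)
theta_inter = theta_123(interacting)
```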

Orthogonal Decomposition of KL Divergence

Using mixed coordinate systems, KL divergence can be orthogonally decomposed as:

D[p : q] = D[p : p̄] + D[p̄ : p̃] + D[p̃ : q]

Where:

  • D[p : p̄]: triplet interaction information
  • D[p̄ : p̃]: pairwise interaction information
  • D[p̃ : q]: activation rate modulation information
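The decomposition can be checked numerically. The sketch below makes a simplifying assumption that is not stated in the summary: the reference distribution q is uniform, so it carries no interactions of any order. Under that assumption p̄ is the distribution that shares all pairwise marginals with p but has θ₁₂₃ = 0 (fitted here by iterative proportional fitting), and p̃ is the product of p's single-variable marginals. All function names are illustrative, and p must be strictly positive.

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler divergence D[p : q] in nats."""
    p, q = np.ravel(p), np.ravel(q)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def pairwise_model(p, n_iter=500):
    """p-bar: same pairwise marginals as p, no triplet interaction (IPF fit)."""
    p = np.asarray(p, dtype=float).reshape(2, 2, 2)
    fit = np.full((2, 2, 2), 1 / 8)  # uniform start has theta_123 = 0
    for _ in range(n_iter):
        for axis in range(3):  # summing out one variable leaves a pairwise marginal
            fit = fit * p.sum(axis=axis, keepdims=True) / fit.sum(axis=axis, keepdims=True)
    return fit

def independent_model(p):
    """p-tilde: product of the single-variable marginals of p."""
    p = np.asarray(p, dtype=float).reshape(2, 2, 2)
    m1, m2, m3 = p.sum(axis=(1, 2)), p.sum(axis=(0, 2)), p.sum(axis=(0, 1))
    return np.einsum("i,j,k->ijk", m1, m2, m3)

def decompose(p):
    """Split D[p : q] into triplet, pairwise and rate terms for a uniform q."""
    p = np.asarray(p, dtype=float).reshape(2, 2, 2)
    q = np.full((2, 2, 2), 1 / 8)  # assumed reference distribution
    p_bar, p_tilde = pairwise_model(p), independent_model(p)
    return {
        "triplet": kl(p, p_bar),
        "pairwise": kl(p_bar, p_tilde),
        "rate": kl(p_tilde, q),
        "total": kl(p, q),
    }

# Hypothetical distribution, ordered (p000, p001, ..., p111).
parts = decompose([0.20, 0.05, 0.10, 0.15, 0.05, 0.15, 0.10, 0.20])
```

For this example the three terms add up to the total divergence to within numerical tolerance, which is the Pythagorean property the decomposition relies on. With a different reference q (for example a baseline-period distribution), p̄ and p̃ would inherit q's higher-order θ coordinates instead of zeros.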

Binarization Methods

1. Sign Method

binary_signal = 1 if EEG_signal > 0 else 0

Captures coarse phase information while ignoring amplitude.

2. Diff Method

diff_signal = diff(EEG_signal)
binary_signal = 1 if diff_signal > 0 else 0

Captures phase transition patterns.

3. Power Method

power = EEG_signal²
envelope = moving_average(power, 30_samples)
z_scores = (envelope - mean(envelope)) / std(envelope)
binary_signal = 1 if z_scores > 1 else 0

Captures high-amplitude periods independent of phase.
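The three schemes above can be sketched as vectorized NumPy functions. This is an illustrative reading of the pseudocode rather than the authors' implementation: the moving average is taken as a centered boxcar via `np.convolve`, and the z-score is computed over whatever array is passed in, both of which are assumptions. The Diff output is one sample shorter than its input.

```python
import numpy as np

def binarize_sign(signal):
    """1 where the (band-passed) signal is positive, else 0."""
    return (np.asarray(signal, dtype=float) > 0).astype(int)

def binarize_diff(signal):
    """1 where the signal is rising from one sample to the next, else 0."""
    return (np.diff(np.asarray(signal, dtype=float)) > 0).astype(int)

def binarize_power(signal, window=30, threshold=1.0):
    """1 where the smoothed power envelope exceeds `threshold` standard deviations."""
    power = np.asarray(signal, dtype=float) ** 2
    # Centered moving average; edges are implicitly zero-padded by mode="same".
    envelope = np.convolve(power, np.ones(window) / window, mode="same")
    # Assumes a non-constant envelope, otherwise the standard deviation is zero.
    z_scores = (envelope - envelope.mean()) / envelope.std()
    return (z_scores > threshold).astype(int)

# Synthetic example: low-amplitude background with a high-amplitude burst.
burst = np.concatenate([0.1 * np.ones(200), 5.0 * np.ones(50), 0.1 * np.ones(200)])
burst_binary = binarize_power(burst)
```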

Statistical Significance Testing

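Under the null hypothesis of no triplet interaction, the following test statistic asymptotically follows a χ² distribution with one degree of freedom, where N is the number of observations:

λ = 2N·D[p : p̄] ~ χ²(1)

Because the data are sparse, the χ² approximation is poor, so a non-parametric test based on IAAFT (Iterative Amplitude Adjusted Fourier Transform) surrogate data is used instead.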
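A compact sketch of the standard IAAFT iteration for a single channel, together with an empirical p-value computed against surrogate statistics. Generating surrogates independently for each channel, the fixed iteration count, and the "(1 + count) / (1 + n)" p-value convention are assumptions made for this illustration; the paper's exact settings are not given in this summary.

```python
import numpy as np

def iaaft(signal, n_iter=100, seed=None):
    """Surrogate with the original amplitude distribution and (approximately)
    the original power spectrum, but randomized phases."""
    rng = np.random.default_rng(seed)
    x = np.asarray(signal, dtype=float)
    sorted_values = np.sort(x)
    target_amplitude = np.abs(np.fft.rfft(x))
    surrogate = rng.permutation(x)  # random shuffle as the starting point
    for _ in range(n_iter):
        # Step 1: impose the original Fourier amplitudes, keep current phases.
        phases = np.angle(np.fft.rfft(surrogate))
        spectral = np.fft.irfft(target_amplitude * np.exp(1j * phases), n=len(x))
        # Step 2: restore the original amplitude distribution by rank matching.
        ranks = np.argsort(np.argsort(spectral))
        surrogate = sorted_values[ranks]
    return surrogate

def surrogate_p_value(observed, surrogate_stats):
    """One-sided empirical p-value: fraction of surrogates at least as large."""
    surrogate_stats = np.asarray(surrogate_stats, dtype=float)
    return (1 + np.sum(surrogate_stats >= observed)) / (1 + surrogate_stats.size)

# Synthetic oscillation plus noise, used only to exercise the functions.
rng = np.random.default_rng(1)
t = np.arange(500)
x = np.sin(2 * np.pi * t / 25) + 0.5 * rng.standard_normal(500)
x_surrogate = iaaft(x, seed=0)
```

In use, the third-order statistic would be recomputed on many surrogate datasets and the observed value compared against that distribution. With this p-value convention, at least 99 surrogates are needed before a result can fall below 0.01.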

Experimental Setup

Dataset

OpenNeuro Motor Imagery Dataset (Triana-Guzman et al., 2022):

  • Participants: 32 healthy subjects (16 female)
  • Electrodes: 17 electrodes placed according to the International 10-20 system
  • Sampling Rate: 250 Hz
  • Experimental Design:
    • 6 blocks (3 seated, 3 standing)
    • 30 trials per block (15 motor imagery, 15 idle state)
    • Total of 45 trials per condition

Trial Structure:

  1. Fixation (4 sec): Fixate on screen crosshair
  2. Observation (3 sec): Display upcoming task
  3. Imagination (4 sec): Execute mental task (motor imagery or idle state)
  4. Rest (4 sec): Free activity

Data Preprocessing

  1. Filtering: 0.5 Hz high-pass filter, 58-62 Hz notch filter
  2. Artifact Removal: Using ASR (Artifact Subspace Reconstruction) method
  3. Frequency Band Filtering: Divided into Delta (0.5-4 Hz), Theta (4-8 Hz), Alpha (8-12 Hz), Beta (12-30 Hz), Gamma (30-60 Hz)
  4. Epoch Extraction: 11-second epochs from 7 seconds before to 4 seconds after imagination task onset
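The epoch arithmetic can be made concrete: at 250 Hz, an 11-second epoch spans 2750 samples, of which 1750 precede imagination onset. The sketch below assumes a `(channels, samples)` array and a known onset sample index; the function names are hypothetical. The 1-second non-overlapping windows are the ones mentioned under Limitations.

```python
import numpy as np

FS = 250                 # sampling rate in Hz
PRE_S, POST_S = 7, 4     # seconds before / after imagination onset

def extract_epoch(data, onset_sample, fs=FS, pre_s=PRE_S, post_s=POST_S):
    """Cut one (channels, samples) epoch around the imagination onset."""
    start, stop = onset_sample - pre_s * fs, onset_sample + post_s * fs
    if start < 0 or stop > data.shape[1]:
        raise ValueError("epoch extends beyond the recording")
    return data[:, start:stop]

def split_windows(epoch, fs=FS, window_s=1):
    """Split an epoch into non-overlapping windows: (windows, channels, samples)."""
    step = window_s * fs
    n_windows = epoch.shape[1] // step
    trimmed = epoch[:, : n_windows * step]
    return trimmed.reshape(epoch.shape[0], n_windows, step).swapaxes(0, 1)

# Placeholder recording: 17 channels whose values equal the sample index.
recording = np.tile(np.arange(5000), (17, 1))
epoch = extract_epoch(recording, onset_sample=2000)
windows = split_windows(epoch)
```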

Evaluation Metrics

  • First-order Mutual Information (I₁): Activation rate modulation information
  • Second-order Mutual Information (I₂): Pairwise interaction information
  • Third-order Mutual Information (I₃): Triplet interaction information
  • Statistical Significance: p < 0.01 (based on IAAFT surrogate data)

Comparison Methods

  1. White Noise Surrogate Data: Effects of pure random structure
  2. IAAFT Surrogate Data: Preserving power spectrum and amplitude distribution while randomizing phase
  3. Different Binarization Methods: Sign vs. Diff vs. Power method comparison

Experimental Results

Main Results

Surrogate Data Validation

  1. IAAFT data produces higher information values than white noise, as expected, since IAAFT preserves power spectral differences between trial phases
  2. Power method shows I₁ > I₂ > I₃ decreasing trend across all frequency bands
  3. Sign and Diff methods show I₂ bias, particularly pronounced in high-frequency bands, limiting their ability to capture third-order interactions

Motor Imagery Data Results

  1. χ² approximation fails: Due to sparse data (45 trials), standard asymptotic distributions are inappropriate
  2. Significant third-order interactions: Statistically significant I₃ detected during observation and imagination phases
  3. False positive control: Approximately 1% of triplets were significant during the fixation phase, which matches the p < 0.01 threshold and supports the appropriateness of the null hypothesis
  4. Temporal dynamics: Different temporal dynamics of third-order information across frequency bands and triplets

Forward Model Validation Results

Information Loss Quantification

  1. Maximum information loss: Occurs when transitioning from idealized binary signals to oscillatory signals (approximately 50% I₃ loss)
  2. Volume conduction effects are minor: Information loss from source signals to scalp electrodes is relatively small
  3. Noise sensitivity: Both I₂ and I₃ decrease substantially once the SNR falls below moderate levels

Dependency Implementation

Successfully implemented known second-order and third-order dependencies in oscillatory source signals:

  • Second-order case: The two target signals are correlated regardless of the control signal's state
  • Third-order case: The two target signals are correlated when the control signal is high (1) and anti-correlated when it is low (0)
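The two dependency types are easiest to see in the idealized binary case, before any oscillatory signals are involved. The sketch below is a toy illustration and not the paper's forward model: the function name, the fair-coin inputs, and the 10% flip noise are all assumptions. In the third-order case the overall correlation between the two targets is close to zero, yet it is strongly positive or negative once the control state is known.

```python
import numpy as np

def simulate_triplet(n, third_order, flip=0.1, seed=0):
    """Draw binary (control, x, y) samples with a pairwise or triplet dependency."""
    rng = np.random.default_rng(seed)
    control = rng.integers(0, 2, n)
    x = rng.integers(0, 2, n)
    if third_order:
        # y copies x when control is 1 and inverts x when control is 0.
        y = np.where(control == 1, x, 1 - x)
    else:
        # y copies x regardless of the control state.
        y = x.copy()
    # Flip a fraction of y so that every pattern has nonzero probability.
    noise = rng.random(n) < flip
    y = np.where(noise, 1 - y, y)
    return control, x, y

c, x, y = simulate_triplet(20000, third_order=True)
corr_all = np.corrcoef(x, y)[0, 1]                   # close to zero
corr_high = np.corrcoef(x[c == 1], y[c == 1])[0, 1]  # strongly positive
corr_low = np.corrcoef(x[c == 0], y[c == 0])[0, 1]   # strongly negative
```

A purely pairwise analysis of x and y would miss this structure, which is why the triplet term of the decomposition is needed to detect it.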

Frequency Band-Specific Findings

  • Delta and Theta: I₂ bias less pronounced for Sign and Diff methods
  • Alpha and higher frequencies: Sign and Diff methods show significant I₂ bias, limiting I₃ detection
  • All frequency bands: Power method maintains reasonable I₁ > I₂ > I₃ hierarchical structure

Related Work

Information Geometry Applications in Neuroscience

  • Amari & Nagaoka (2000): Foundational information geometry theory
  • Nakahara & Amari (2002): Information geometric measurements of neural spike trains
  • Tatsuno et al. (2009): Robust estimation of connection strength and external inputs

EEG Analysis Methods

  • Traditional Methods: Primarily based on power spectral analysis and pairwise correlations
  • Functional Connectivity: Regional relationships based on statistical dependencies as defined by Friston (1995)
  • Network Analysis: Complex brain network analysis by Bullmore & Sporns (2009)

Higher-Order Interaction Research

  • Battiston et al. (2020, 2021): Network structures and dynamics beyond pairwise interactions
  • This Paper's Contribution: First systematic application of information geometry methods to higher-order interaction analysis in EEG data

Conclusions and Discussion

Main Conclusions

  1. Method Feasibility: Information geometry methods can be successfully extended to continuous EEG signal analysis
  2. Importance of Binarization Strategy: Power method is most suitable for detecting higher-order interactions
  3. Genuine Higher-Order Interactions: Statistically significant third-order interactions detected in motor imagery tasks
  4. Information Loss Mechanisms: Primary information loss occurs during the transition from binary to oscillatory signals

Limitations

  1. Computational Complexity: 17 channels approach the feasibility limit; high-density arrays (128-256 channels) may present computational challenges
  2. Temporal Resolution: 1-second non-overlapping windows provide coarse temporal dynamics
  3. Within-Band Analysis: Only considers interactions within the same frequency band, not cross-frequency analysis
  4. Binarization Constraints: May miss more complex nonlinear interaction patterns
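The scalability concern in the first limitation follows from the number of channel triplets, which grows roughly with the cube of the channel count. A quick calculation, counting unordered triplets only (any multiplication by frequency bands, time windows, or surrogate datasets comes on top of this):

```python
from math import comb

def n_triplets(n_channels):
    """Number of unordered channel triplets."""
    return comb(n_channels, 3)

for n in (17, 128, 256):
    print(f"{n} channels: {n_triplets(n)} triplets")
```

This gives 680 triplets for 17 channels, 341,376 for 128 channels, and 2,763,520 for 256 channels.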

Future Directions

  1. Hybrid Binarization: Combining different binarization methods to detect phenomena such as phase-amplitude coupling
  2. Adaptive Windows: Using frequency-adaptive overlapping windows to improve temporal resolution
  3. Cross-Frequency Analysis: Extension to higher-order interactions across different frequency bands
  4. Higher-Order Interactions: Exploration of fourth-order and higher interaction patterns

In-Depth Evaluation

Strengths

  1. Methodological Innovation: Successfully extends information geometry methods from discrete to continuous signal domains
  2. Rigorous Validation: Provides comprehensive validation framework through forward modeling and surrogate data
  3. Practical Value: Provides actionable tools for higher-order interaction analysis in EEG data
  4. Theoretical Contribution: Quantifies information loss at each analytical step

Weaknesses

  1. Sample Size Limitation: 45 trials per condition is relatively small, potentially affecting statistical power
  2. Binarization Simplification: Reducing complex continuous signals to binary may lose important information
  3. Computational Scalability: Computational challenges for high-density EEG arrays not fully addressed
  4. Biological Interpretation: Insufficient discussion of neurobiological significance of detected third-order interactions

Impact

  1. Methodological Impact: Provides novel mathematical tools for higher-order analysis of neural signals
  2. Application Prospects: Applicable to brain-computer interfaces, neurological disease diagnosis, and other fields
  3. Theoretical Value: Advances understanding of complex brain network organization
  4. Reproducibility: Provides open-source code and public datasets supporting result reproduction

Applicable Scenarios

  1. Basic Neuroscience Research: Exploring higher-order organizational principles of brain networks
  2. Clinical Applications: Analysis of higher-order connectivity patterns in neurological diseases
  3. Brain-Computer Interfaces: Extracting richer neural signal features for control
  4. Cognitive Neuroscience: Investigating complex neural interactions in cognitive tasks

References

The paper cites 28 references, including:

  1. Information Geometry Foundations: Amari & Nagaoka (2000), Amari (2001)
  2. Neuroscience Applications: Nakahara & Amari (2002), Tatsuno et al. (2009)
  3. EEG Methodology: Delorme & Makeig (2004), Oostenveld et al. (2011)
  4. Higher-Order Networks: Battiston et al. (2020, 2021)
  5. Data Sources: Triana-Guzman et al. (2022)

Overall Assessment: This is a high-quality methodological paper that successfully extends information geometry theory to EEG signal analysis. While presenting some limitations in computational scalability and biological interpretation, its rigorous validation framework and innovative binarization strategies provide important theoretical and practical contributions to higher-order interaction analysis of neural signals.