
A Mixed Model Approach for Estimating Regional Functional Connectivity from Voxel-level BOLD Signals

Liu, Zhang, Tran et al.
Resting-state brain functional connectivity quantifies the synchrony between activity patterns of different brain regions. In functional magnetic resonance imaging (fMRI), each region comprises a set of spatially contiguous voxels at which blood-oxygen-level-dependent signals are acquired. The ubiquitous Correlation of Averages (CA) estimator, and other similar metrics, are computed from spatially aggregated signals within each region, and remain the quantifications of inter-regional connectivity most used by neuroscientists despite their bias that stems from intra-regional correlation and measurement error. We leverage the framework of linear mixed-effects models to isolate different sources of variability in the voxel-level signals, including both inter-regional and intra-regional correlation and measurement error. A novel computational pipeline, focused on subject-level inter-regional correlation parameters of interest, is developed to address the challenges of applying maximum (or restricted maximum) likelihood estimation to such structured, high-dimensional spatiotemporal data. Simulation results demonstrate the reliability of correlation estimates and their large sample standard error approximations, and their superiority relative to CA. The proposed method is applied to two public fMRI data sets. First, we analyze scans of a dead rat to assess false positive performance when connectivity is absent. Second, individual human brain networks are constructed for subjects from a Human Connectome Project test-retest database. Concordance between inter-regional correlation estimates for test-retest scans of the same subject are shown to be higher for the proposed method relative to CA.

Basic Information

  • Paper ID: 2211.02192
  • Title: A Mixed Model Approach for Estimating Regional Functional Connectivity from Voxel-level BOLD Signals
  • Authors: Ruobin Liu, Chao Zhang, Chau Tran, Sophie Achard, Wendy Meiring, Alexander Petersen
  • Classification: stat.ME (Statistics - Methodology), stat.AP (Statistics - Applications)
  • Publication Date: November 2022 (arXiv preprint, updated November 2025)
  • Paper Link: https://arxiv.org/abs/2211.02192

Abstract

This paper addresses the estimation of resting-state brain functional connectivity from functional magnetic resonance imaging (fMRI) data by proposing a novel approach based on linear mixed-effects models. Although the traditional "Correlation of Averages" (CA) estimator is widely used, it suffers from bias issues due to within-region correlations and measurement error. Through a linear mixed-effects model framework, this paper separates different sources of variability in voxel-level signals, including between-region and within-region correlations as well as measurement error. The study develops a novel computational pipeline focusing on individual-level between-region correlation parameter estimation and employs maximum likelihood estimation to address challenges in high-dimensional spatiotemporal data. Simulation results demonstrate the reliability of the correlation estimates and their superiority over the CA method.

Research Background and Motivation

Problem Definition

  1. Core Problem: How to accurately estimate brain region functional connectivity from voxel-level BOLD signals while avoiding bias inherent in traditional methods
  2. Technical Challenges:
    • fMRI data exhibits complex spatiotemporal dependencies
    • The number of voxels far exceeds the temporal dimension, creating computational challenges
    • Within-region spatial correlation and measurement error affect the accuracy of connectivity estimation

Research Significance

  • Functional connectivity is fundamental to studying the pathophysiology of neurodegenerative diseases and disorders of consciousness
  • Accurate connectivity estimation is critical for both individual and population-level neuroscience research
  • Bias in existing methods may impact disease diagnosis and individual characterization studies

Limitations of Existing Methods

Problems with the traditional CA estimator:

  1. Bias Issue: The estimate depends on the within-region correlation αⱼ and the noise-to-signal ratio βⱼ, which biases it toward zero
  2. Neglected Dependencies: Fails to account for spatiotemporal dependencies in voxel-level signals
  3. Parameter Constraints: The quantity being estimated is affected by the sampling scheme and machine noise, so it is not an intrinsic property of the brain
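
The bias toward zero can be illustrated with a small simulation. The sketch below is not the paper's simulation design: it assumes a deliberately simplified toy model in which every voxel equals its regional signal plus independent noise, with no within-region spatial correlation, and all function names and parameter values are hypothetical.

```python
import numpy as np

def correlation_of_averages(x1, x2):
    """CA estimator: Pearson correlation of the voxel-averaged signals.

    x1, x2: arrays of shape (n_voxels, n_timepoints), one per region.
    """
    return np.corrcoef(x1.mean(axis=0), x2.mean(axis=0))[0, 1]

def simulate_two_regions(rho, n_voxels, n_time, noise_sd, rng):
    """Toy data (assumed model): regional signals with correlation rho
    plus independent voxel-level noise."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    eta = rng.multivariate_normal(np.zeros(2), cov, size=n_time)  # (n_time, 2)
    x1 = eta[:, 0] + noise_sd * rng.standard_normal((n_voxels, n_time))
    x2 = eta[:, 1] + noise_sd * rng.standard_normal((n_voxels, n_time))
    return x1, x2

rng = np.random.default_rng(0)
rho_true = 0.6
estimates = [
    correlation_of_averages(*simulate_two_regions(rho_true, 5, 200, 2.0, rng))
    for _ in range(200)
]
mean_ca = float(np.mean(estimates))
print(f"true rho = {rho_true}, mean CA estimate = {mean_ca:.3f}")
```

In this toy setting the averaged signals have population correlation ρ / (1 + noise_sd² / n_voxels) = 0.6 / 1.8 ≈ 0.33, so the CA estimates concentrate well below the true value of 0.6. Averaging more voxels shrinks the noise term here, but it would not remove the effect of within-region correlation, which this sketch leaves out.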

Core Contributions

  1. Proposed Novel Statistical Model: A voxel-level BOLD signal modeling framework based on linear mixed-effects models that explicitly distinguishes between-region and within-region variability
  2. Developed Efficient Estimation Methods:
    • Two-stage estimation strategy combined with Restricted Maximum Likelihood Estimation (ReML)
    • First application of Vecchia likelihood approximation in functional connectivity modeling
  3. Theoretical Guarantees: Provides large-sample properties and asymptotic inference theory for the estimators
  4. Empirical Validation: Verifies method superiority on simulations and real data (dead rat scans, HCP test-retest data)

Detailed Methodology

Task Definition

Input: Wavelet coefficients of voxel-level BOLD signals Xⱼₗₘ, where j=1,...,J are brain regions, l=1,...,Lⱼ are voxels, m=1,...,M are wavelet coefficients

Output: Between-region correlation parameters ρⱼⱼ' for constructing functional connectivity networks

Constraints: Computational feasibility for high-dimensional spatiotemporal data

Model Architecture

BOLD Mixed-Effects Model

The core model is:

X = Zμ + Uη + γ + ε

Where:

  • μⱼ: Region fixed effects (regional mean)
  • ηⱼₘ: Region random effects (inducing between-region dependencies)
  • γⱼₗₘ: Voxel-level random effects (inducing within-region dependencies)
  • εⱼₗₘ: Measurement error
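
Written elementwise, the matrix form above can be read as follows. This is a sketch that assumes Z and U are indicator design matrices assigning each voxel-level observation to its region; that assumption is consistent with the index ranges given in the task definition but is not stated explicitly in this summary.

```latex
X_{jlm} = \mu_j + \eta_{jm} + \gamma_{jlm} + \varepsilon_{jlm},
\qquad j = 1,\dots,J,\quad l = 1,\dots,L_j,\quad m = 1,\dots,M .
```

Under this reading, all voxels in region j share μⱼ and ηⱼₘ, while γⱼₗₘ and εⱼₗₘ vary from voxel to voxel.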

Covariance Structure Parameterization

  1. Between-Region Correlation: Var(η) = (SRS) ⊗ A, where R = {ρⱼⱼ'} is the target correlation matrix
  2. Within-Region Structure: Λⱼ = Cⱼ ⊗ Bⱼ (separable spatial-temporal covariance)
  3. Kernel Definitions:
    • Spatial kernel: Matérn kernel K(d; ν, φ)
    • Temporal kernel: Gaussian kernel H(|m-m'|; τ)
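
A minimal sketch of how such a separable within-region covariance could be assembled is given below. It uses the Gaussian form H(u; τ) = exp(-τ²u²/2) and the Matérn-5/2 choice listed under Implementation Details. The Matérn parameterization, the reading of Cⱼ as the spatial factor and Bⱼ as the temporal factor, the variance scale `sigma2`, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(u, tau):
    """Temporal kernel H(u; tau) = exp(-tau^2 u^2 / 2)."""
    return np.exp(-0.5 * (tau * u) ** 2)

def matern52_kernel(d, phi):
    """Matern kernel with smoothness nu = 5/2 and range phi
    (standard form; the paper's parameterization may differ)."""
    s = np.sqrt(5.0) * d / phi
    return (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

def separable_covariance(coords, n_time, phi, tau, sigma2=1.0):
    """Return sigma2 * (C kron B) with C spatial (Matern) and B temporal (Gaussian)."""
    # Pairwise Euclidean distances between voxel coordinates.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    C = matern52_kernel(d, phi)
    m = np.arange(n_time)
    B = gaussian_kernel(np.abs(m[:, None] - m[None, :]), tau)
    return sigma2 * np.kron(C, B)

# Three hypothetical voxel locations and four wavelet coefficients.
coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
Lambda = separable_covariance(coords, n_time=4, phi=1.5, tau=0.8)
print(Lambda.shape)
```

The Kronecker structure means only the small spatial and temporal factors need to be stored and factorized, which is one reason separable models are attractive for spatiotemporal data.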

Technical Innovations

Two-Stage Estimation Strategy

Stage 1: Region-specific parameter estimation

  • Use ReML to estimate parameters for each region: θⱼ = (kᵧⱼ, σ²ᵧⱼ, φᵧⱼ, τᵧⱼ)
  • Eliminate the influence of region effects through restricted likelihood

Stage 2: Global and between-region parameter estimation

  • Estimate between-region correlation parameters: θ = (τ_η, k_η, ρ₁₂, σ²_η)
  • Fix Stage 1 estimates and focus on connectivity parameters

Vecchia Approximation

Evaluating the exact Gaussian likelihood for N observations costs O(N³) time and O(N²) memory. To reduce this cost, the Vecchia likelihood approximation is employed:

p(X) ≈ p(X_π(1)) ∏ᵢ₌₂ᴺ p(X_π(i) | X_π(j), j ∈ Jᵢ)

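Here π is an ordering of the observations and each conditioning set Jᵢ contains only indices that precede i in that ordering. Computational efficiency is achieved by keeping the conditioning sets small (|Jᵢ| = 100).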
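
The factorization can be sketched in a few lines for a zero-mean Gaussian vector. This is not the paper's implementation: the rule of conditioning on the immediately preceding observations, the one-dimensional test covariance, and all names are assumptions chosen to keep the example short.

```python
import numpy as np

def gaussian_loglik(x, cov):
    """Exact zero-mean Gaussian log-likelihood (dense solve, O(N^3))."""
    n = len(x)
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

def vecchia_loglik(x, cov, n_cond):
    """Vecchia approximation: sum of univariate conditional log-densities.

    Observation i is conditioned on at most n_cond observations that
    immediately precede it in the given ordering (assumed neighbor rule).
    """
    total = 0.0
    for i in range(len(x)):
        idx = np.arange(max(0, i - n_cond), i)
        mean, var = 0.0, cov[i, i]
        if idx.size > 0:
            # Conditional mean and variance of x[i] given x[idx].
            w = np.linalg.solve(cov[np.ix_(idx, idx)], cov[idx, i])
            mean = w @ x[idx]
            var = cov[i, i] - w @ cov[idx, i]
        total += -0.5 * (np.log(2.0 * np.pi * var) + (x[i] - mean) ** 2 / var)
    return total

rng = np.random.default_rng(1)
n = 30
t = np.sort(rng.uniform(0.0, 10.0, size=n))
cov = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2) + 0.1 * np.eye(n)
x = rng.multivariate_normal(np.zeros(n), cov)

exact = gaussian_loglik(x, cov)
approx_full = vecchia_loglik(x, cov, n_cond=n - 1)   # full conditioning sets
approx_small = vecchia_loglik(x, cov, n_cond=5)      # small conditioning sets
print(exact, approx_full, approx_small)
```

With full conditioning sets the product is just the chain rule, so it reproduces the exact log-likelihood. Shrinking the sets replaces one N × N solve with N solves of size at most |Jᵢ|, at the price of an approximation error.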

Experimental Setup

Datasets

  1. Simulated Data:
    • J=3 brain regions, M=60 wavelet coefficients
    • Using spatial coordinates from live rat experiments (L₁=41, L₂=25, L₃=77 voxels)
    • Varying signal strength δⱼ ∈ {0.1, 0.5, 0.7} and spatial covariance ψⱼ ∈ {0.2, 0.5, 0.8}
  2. Real Data:
    • Dead rat scan data (validating false positive rates)
    • HCP test-retest database (42 subjects, J=92 default mode network regions)

Evaluation Metrics

  1. Simulations: Mean Squared Error (MSE), Mean Absolute Deviation (MAD)
  2. HCP Data: Concordance Correlation Coefficient (CCC) for assessing test-retest reliability
  3. Dead Rat Data: False positive rate analysis
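
For reference, the concordance correlation coefficient can be computed as in the sketch below, which uses Lin's standard definition with 1/n variances. The function name and the example edge weights are hypothetical, and the paper may apply the coefficient to its estimates in a different way.

```python
import numpy as np

def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    sxy = np.mean((x - mx) * (y - my))
    # Penalizes both weak correlation and shifts in location or scale.
    return 2.0 * sxy / (x.var() + y.var() + (mx - my) ** 2)

# Hypothetical correlation estimates for the same edges in two scans.
test_scan = np.array([0.10, 0.40, 0.35, 0.80])
retest_scan = test_scan + 0.2   # perfectly correlated but shifted

ccc_same = concordance_correlation(test_scan, test_scan)
ccc_shift = concordance_correlation(test_scan, retest_scan)
print(ccc_same, ccc_shift)
```

Unlike the Pearson correlation, which equals 1 for the shifted pair, the CCC drops below 1 whenever the two scans disagree in level, which is why it suits test-retest comparisons.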

Comparison Methods

  1. ρ̂CA: Traditional correlation of averages estimator
  2. ρ̂EBLUE: Correlation based on empirical best linear unbiased estimator
  3. ρ̂ReML: Complete ReML estimator
  4. ρ̂Vecchia: Vecchia approximation estimator

Implementation Details

  • Kernels: Gaussian kernel H(u;τ) = exp(-τ²u²/2), Matérn-5/2 kernel
  • Optimization: L-BFGS quasi-Newton method
  • Vecchia conditional set size: |Jᵢ| = 100
  • Significance testing: Benjamini-Yekutieli procedure, FDR < 0.2
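
The Benjamini-Yekutieli step-up procedure mentioned above can be sketched as follows. The function name and the example p-values are hypothetical; how the paper obtains its edge-level p-values is not shown here.

```python
import numpy as np

def benjamini_yekutieli(pvals, q):
    """Boolean rejection mask for the BY procedure at FDR level q."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    # Harmonic-sum correction that makes the procedure valid under dependence.
    c_m = np.sum(1.0 / np.arange(1, m + 1))
    order = np.argsort(p)
    thresholds = np.arange(1, m + 1) * q / (m * c_m)
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest rank meeting its threshold
        reject[order[: k + 1]] = True
    return reject

# Hypothetical p-values for five candidate edges.
pvals = [0.001, 0.008, 0.04, 0.2, 0.6]
reject = benjamini_yekutieli(pvals, q=0.2)
print(reject)
```

For these five values the thresholds are k × 0.2 / (5 × 2.2833), so the three smallest p-values are rejected and the other two are not.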

Experimental Results

Main Results

Simulation Performance

  1. Accuracy: ρ̂ReML exhibits the smallest standard deviation across all settings with medians closest to true values
  2. Bias Analysis:
    • When ρ=0.6, CA and EBLUE show significant bias toward zero
    • High spatial covariance (ψ=0.8) exacerbates bias in CA and EBLUE
    • ρ̂ReML maintains robustness across various settings

Numerical Results Example

Under moderate signal strength (δ=0.5):

  • Low spatial covariance (ψ=0.2): ρ̂ReML has an MSE of 0.008-0.025, compared with 0.016-0.033 for CA
  • High spatial covariance (ψ=0.8): The gap widens, with ρ̂ReML at an MSE of 0.012-0.028 versus 0.056-0.194 for CA

Ablation Studies

  1. Vecchia Approximation Verification: ρ̂Vecchia performance nearly identical to ρ̂ReML, validating the approximation method
  2. Model Misspecification Robustness: ρ̂ReML outperforms traditional methods under alternative covariance structures
  3. Oracle Estimator Comparison: Two-stage procedure shows minimal performance loss

Real Data Results

Dead Rat Scan Analysis

  • False Positive Control: ρ̂Vecchia shows no significant edges at the 5% significance level, while the CA method still identifies significant edges
  • FDR Control: Both methods show no significant edges after BY adjustment (q<0.2), as expected

HCP Test-Retest Analysis

  • Consistency Improvement: Mixed model methods show higher CCC for most subjects across all graph construction strategies
  • Edge Proportion: When the top 1%-20% of edges are selected, mixed model methods consistently outperform CA
  • Statistical Significance: Among top 10% edges, approximately 60-80% of subjects show higher test-retest consistency

Main Research Directions

  1. Voxel-level Modeling: Woolrich et al. (2004) mixed-effects models for task-related activation
  2. Population-level Connectivity: Bowman et al. (2008) Bayesian hierarchical models
  3. Frequency Domain Methods: Kang et al. (2012) frequency domain mixed-effects models
  4. Spatiotemporal Modeling: Castruccio et al. (2018) VAR process approaches

Advantages of This Work

  1. Resting-State Specific: Designed specifically for resting-state data, distinct from task-based studies
  2. Individual Level: Focuses on individual brain network construction rather than population inference
  3. Connectivity Priority: Emphasizes between-region correlation as the primary parameter rather than task effects
  4. Computational Innovation: First application of Vecchia approximation in functional connectivity

Conclusions and Discussion

Main Conclusions

  1. Method Effectiveness: Mixed-effects models significantly improve accuracy and reliability of functional connectivity estimation
  2. Bias Correction: Successfully addresses systematic bias issues in CA estimators
  3. Computational Feasibility: Vecchia approximation makes the method applicable to large-scale data
  4. Practical Value: Demonstrates superior test-retest consistency in real data

Limitations

  1. Computational Complexity: Despite approximation methods, still more computationally intensive than CA
  2. Model Assumptions: Relies on Gaussian assumptions and separable covariance structures
  3. Parameter Estimation: Some smoothing parameters require presetting rather than estimation
  4. Predefined Regions: Depends on predefined brain parcellations rather than data-driven approaches

Future Directions

  1. Subject-Specific Regions: Integrate data-driven region discovery methods
  2. Multi-Scale Modeling: Extend to joint analysis across multiple wavelet scales
  3. Non-Gaussian Extensions: Consider robustness to non-Gaussian distributions
  4. Real-Time Applications: Develop more efficient online estimation algorithms

In-Depth Evaluation

Strengths

  1. Theoretical Rigor: Provides comprehensive statistical framework and asymptotic properties
  2. Methodological Innovation: Cleverly combines mixed-effects models with computational approximation techniques
  3. Comprehensive Experiments: Encompasses simulations, control experiments, and real data validation
  4. Strong Practicality: Addresses real problems in neuroscience
  5. Reproducibility: Provides detailed implementation details and parameter settings

Weaknesses

  1. Computational Overhead: Still carries substantial computational burden compared to traditional methods
  2. Parameter Tuning: Requires numerous hyperparameter choices and model specifications
  3. Scalability: Applicability to larger-scale datasets requires further verification
  4. Biological Interpretation: Lacks in-depth discussion of biological significance of model parameters

Impact

  1. Academic Contribution: Provides new statistical framework for functional connectivity analysis
  2. Practical Value: Directly applicable to clinical and basic neuroscience research
  3. Methodological Influence: Advances development of statistical methods in computational neuroscience
  4. Reproducibility: Detailed methodology description facilitates subsequent research

Applicable Scenarios

  1. Individual Brain Network Analysis: Particularly suitable for studies requiring accurate individual connectivity estimation
  2. Clinical Applications: Disease diagnosis and treatment efficacy assessment
  3. Longitudinal Studies: Research with high test-retest reliability requirements
  4. Large-Scale Data: Analysis of neuroimaging data with high-dimensional spatiotemporal structure

References

The paper cites 63 related references, primarily including:

  • Achard et al. (2023): Theoretical analysis of between-region correlation estimators
  • Vecchia (1988): Likelihood approximation methods for spatial processes
  • Bowman et al. (2008): Bayesian hierarchical modeling of fMRI data
  • Kang et al. (2012, 2017): Spatiotemporal mixed-effects models
  • Castruccio et al. (2018): Multi-resolution spatiotemporal models

This paper makes important methodological contributions to fMRI functional connectivity analysis through rigorous statistical modeling and computational innovation, demonstrating high academic value and practical significance.