A Mixed Model Approach for Estimating Regional Functional Connectivity from Voxel-level BOLD Signals

Liu, Zhang, Tran et al.

Resting-state brain functional connectivity quantifies the synchrony between activity patterns of different brain regions. In functional magnetic resonance imaging (fMRI), each region comprises a set of spatially contiguous voxels at which blood-oxygen-level-dependent signals are acquired. The ubiquitous Correlation of Averages (CA) estimator, and other similar metrics, are computed from spatially aggregated signals within each region, and remain the quantifications of inter-regional connectivity most used by neuroscientists despite their bias, which stems from intra-regional correlation and measurement error. We leverage the framework of linear mixed-effects models to isolate different sources of variability in the voxel-level signals, including both inter-regional and intra-regional correlation and measurement error. A novel computational pipeline, focused on subject-level inter-regional correlation parameters of interest, is developed to address the challenges of applying maximum (or restricted maximum) likelihood estimation to such structured, high-dimensional spatiotemporal data. Simulation results demonstrate the reliability of correlation estimates and their large-sample standard error approximations, and their superiority relative to CA. The proposed method is applied to two public fMRI data sets. First, we analyze scans of a dead rat to assess false positive performance when connectivity is absent. Second, individual human brain networks are constructed for subjects from a Human Connectome Project test-retest database. Concordance between inter-regional correlation estimates for test-retest scans of the same subject is shown to be higher for the proposed method relative to CA.

Basic Information

Paper ID: 2211.02192
Title: A Mixed Model Approach for Estimating Regional Functional Connectivity from Voxel-level BOLD Signals
Authors: Ruobin Liu, Chao Zhang, Chau Tran, Sophie Achard, Wendy Meiring, Alexander Petersen
Classification: stat.ME (Statistics - Methodology), stat.AP (Statistics - Applications)
Publication Date: November 2022 (arXiv preprint, updated November 2025)
Paper Link: https://arxiv.org/abs/2211.02192

Abstract

This paper proposes a linear mixed-effects approach to estimating resting-state brain functional connectivity from functional magnetic resonance imaging (fMRI) data. The widely used "Correlation of Averages" (CA) estimator is biased by within-region correlation and measurement error. The mixed-effects framework separates the sources of variability in voxel-level signals: between-region correlation, within-region correlation, and measurement error. A computational pipeline focused on subject-level between-region correlation parameters makes maximum (or restricted maximum) likelihood estimation feasible for high-dimensional spatiotemporal data. Simulation results demonstrate the reliability of the correlation estimates and their superiority over CA.

Research Background and Motivation

Problem Definition

Core Problem: How to accurately estimate brain region functional connectivity from voxel-level BOLD signals while avoiding bias inherent in traditional methods
Technical Challenges:
- fMRI data exhibits complex spatiotemporal dependencies
- The number of voxels far exceeds the temporal dimension, creating computational challenges
- Within-region spatial correlation and measurement error affect the accuracy of connectivity estimation

Research Significance

Functional connectivity is fundamental to studying pathophysiology of neurodegenerative diseases and disorders of consciousness
Accurate connectivity estimation is critical for both individual and population-level neuroscience research
Bias in existing methods may impact disease diagnosis and individual characterization studies

Limitations of Existing Methods

Problems with the traditional CA estimator:

Bias Issue: The estimate depends on the within-region correlation αⱼ and the noise-to-signal ratio βⱼ, which bias it toward zero
Neglected Dependencies: Fails to account for spatiotemporal dependencies in voxel-level signals
Parameter Constraints: The quantity CA estimates is affected by the voxel sampling scheme and machine noise, so it is not an intrinsic property of the brain
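
The attenuation toward zero can be illustrated with a small simulation. The sketch below is not the paper's simulation design: it assumes two regions, an exchangeable within-region component (one shared term plus voxel-specific terms), and independent measurement noise. The function name and all parameter names are hypothetical.

```python
import numpy as np

def correlation_of_averages(rho=0.6, n_vox=50, n_time=2000,
                            within_corr=0.5, noise_sd=1.0, seed=0):
    """Simulate two regions and return the CA estimate of rho.

    Assumed toy model (not the paper's): voxel signal =
    region effect + exchangeable within-region effect + noise.
    """
    rng = np.random.default_rng(seed)
    # Region-level effects with unit variance and true correlation rho.
    cov_eta = np.array([[1.0, rho], [rho, 1.0]])
    eta = rng.multivariate_normal(np.zeros(2), cov_eta, size=n_time)
    region_means = []
    for j in range(2):
        # Within-region effect: a term shared by all voxels in the region
        # plus a voxel-specific term, giving pairwise correlation within_corr.
        shared = np.sqrt(within_corr) * rng.normal(size=(n_time, 1))
        own = np.sqrt(1.0 - within_corr) * rng.normal(size=(n_time, n_vox))
        noise = noise_sd * rng.normal(size=(n_time, n_vox))
        voxels = eta[:, [j]] + shared + own + noise
        # CA first averages over voxels, then correlates the averages.
        region_means.append(voxels.mean(axis=1))
    return np.corrcoef(region_means[0], region_means[1])[0, 1]

ca_attenuated = correlation_of_averages(within_corr=0.5, noise_sd=1.0)
ca_clean = correlation_of_averages(within_corr=0.0, noise_sd=0.0)
print(f"true rho = 0.6, CA with correlated voxels and noise: {ca_attenuated:.3f}")
print(f"true rho = 0.6, CA without either: {ca_clean:.3f}")
```

Under these assumptions the shared within-region term survives averaging no matter how many voxels are pooled, which is why the first estimate falls well below 0.6 while the second stays close to it.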

Core Contributions

Proposed Novel Statistical Model: A voxel-level BOLD signal modeling framework based on linear mixed-effects models that explicitly distinguishes between-region and within-region variability
Developed Efficient Estimation Methods:
- Two-stage estimation strategy combined with Restricted Maximum Likelihood Estimation (ReML)
- First application of Vecchia likelihood approximation in functional connectivity modeling
Theoretical Guarantees: Provides large-sample properties and asymptotic inference theory for the estimators
Empirical Validation: Verifies method superiority on simulations and real data (dead rat scans, HCP test-retest data)

Detailed Methodology

Task Definition

Input: Wavelet coefficients of voxel-level BOLD signals Xⱼₗₘ, where j=1,...,J are brain regions, l=1,...,Lⱼ are voxels, m=1,...,M are wavelet coefficients
Output: Between-region correlation parameters ρⱼⱼ' for constructing functional connectivity networks
Constraints: Computational feasibility for high-dimensional spatiotemporal data

Model Architecture

BOLD Mixed-Effects Model

The core model is:

X = Zμ + Uη + γ + ε

Where:

μⱼ: Region fixed effects (regional mean)
ηⱼₘ: Region random effects (inducing between-region dependencies)
γⱼₗₘ: Voxel-level random effects (inducing within-region dependencies)
εⱼₗₘ: Measurement error
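
Read element-wise, and assuming Z and U are the design matrices that map each region's fixed effect and random effects onto its voxels and wavelet coefficients (an inference from the indices of the components listed above, not a formula quoted from the paper), the model can be written as:

```latex
X_{jlm} = \mu_j + \eta_{jm} + \gamma_{jlm} + \varepsilon_{jlm},
\qquad j = 1,\dots,J,\quad l = 1,\dots,L_j,\quad m = 1,\dots,M .
```

In this form every voxel in region j shares μⱼ and ηⱼₘ, so correlation between regions can enter only through η, while γ and ε vary from voxel to voxel.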

Covariance Structure Parameterization

Between-Region Correlation: Var(η) = (SRS) ⊗ A, where R = {ρⱼⱼ'} is the target correlation matrix
Within-Region Structure: Λⱼ = Cⱼ ⊗ Bⱼ (separable spatial-temporal covariance)
Kernel Definitions:
- Spatial kernel: Matérn kernel K(d; ν, φ)
- Temporal kernel: Gaussian kernel H(|m-m'|; τ)
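
As a rough illustration of how the two kernels combine into a separable covariance, the sketch below builds a Kronecker product of a spatial Matérn matrix and a temporal Gaussian matrix. The Gaussian form exp(-τ²u²/2) and the Matérn smoothness 5/2 are those listed under Implementation Details; the Matérn range parameterization, the Kronecker ordering (space outer, time inner), the omission of variance scale factors, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(u, tau):
    """Temporal kernel H(u; tau) = exp(-tau^2 u^2 / 2)."""
    return np.exp(-0.5 * (tau * np.asarray(u, dtype=float)) ** 2)

def matern52_kernel(d, phi):
    """Matern kernel with smoothness 5/2; phi is an assumed range parameter."""
    s = np.sqrt(5.0) * np.asarray(d, dtype=float) / phi
    return (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

def separable_covariance(coords, n_coef, phi, tau):
    """Kronecker product of spatial and temporal correlation matrices."""
    # Pairwise Euclidean distances between voxel coordinates.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    spatial = matern52_kernel(dist, phi)
    # Absolute lags |m - m'| between wavelet coefficient indices.
    idx = np.arange(n_coef)
    lags = np.abs(np.subtract.outer(idx, idx))
    temporal = gaussian_kernel(lags, tau)
    return np.kron(spatial, temporal)

rng = np.random.default_rng(0)
toy_coords = rng.uniform(size=(4, 3))   # 4 hypothetical voxels in 3-D
toy_cov = separable_covariance(toy_coords, n_coef=5, phi=0.5, tau=1.0)
print(toy_cov.shape)
```

Separability keeps storage manageable: only the two small factor matrices need to be formed, and the full product is required only when the likelihood is evaluated.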

Technical Innovations

Two-Stage Estimation Strategy

Stage 1: Region-specific parameter estimation

Use ReML to estimate the parameters of each region, θⱼ = (kᵧⱼ, σ²ᵧⱼ, φᵧⱼ, τᵧⱼ)
Eliminate the influence of region effects through restricted likelihood

Stage 2: Global and between-region parameter estimation

Estimate the global and between-region parameters θ = (τ_η, k_η, ρ₁₂, σ²_η)
Fix Stage 1 estimates and focus on connectivity parameters

Vecchia Approximation

To address computational complexity (O(N³) time, O(N²) memory), the Vecchia likelihood approximation is employed:

p(X) ≈ p(X_π(1)) ∏ᵢ₌₂ᴺ p(X_π(i) | X_π(j), j ∈ Jᵢ)

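Computational efficiency comes from keeping the conditioning sets small (|Jᵢ| = 100), so each conditional density requires solving only a small linear system rather than factorizing the full N×N covariance matrix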
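
A minimal sketch of the factorization for a zero-mean Gaussian vector is given below. It assumes the identity ordering for π and takes each conditioning set to be the m nearest previously ordered points; both are common choices but are assumptions here, not details taken from the paper, and the function names are hypothetical.

```python
import numpy as np

def gaussian_logpdf(x, mean, var):
    """Log density of a univariate normal."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def vecchia_loglik(x, cov, coords, m):
    """Vecchia log-likelihood: each point conditions on at most m earlier points."""
    total = gaussian_logpdf(x[0], 0.0, cov[0, 0])
    for i in range(1, len(x)):
        prev = np.arange(i)
        # Conditioning set: the m nearest points among those ordered earlier.
        dist = np.linalg.norm(coords[prev] - coords[i], axis=1)
        nbrs = prev[np.argsort(dist)[:m]]
        c_nn = cov[np.ix_(nbrs, nbrs)]
        c_in = cov[i, nbrs]
        weights = np.linalg.solve(c_nn, c_in)
        cond_mean = weights @ x[nbrs]
        cond_var = cov[i, i] - weights @ c_in
        total += gaussian_logpdf(x[i], cond_mean, cond_var)
    return total

def exact_loglik(x, cov):
    """Exact zero-mean multivariate normal log-likelihood, for comparison."""
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + quad)

rng = np.random.default_rng(1)
coords = rng.uniform(size=(30, 2))
pair_dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
cov = np.exp(-pair_dist / 0.3) + 0.1 * np.eye(30)   # toy covariance with a nugget
x = rng.multivariate_normal(np.zeros(30), cov)
print(vecchia_loglik(x, cov, coords, m=5), exact_loglik(x, cov))
```

When m is at least N-1 every point conditions on all earlier points and the product of conditionals equals the exact joint density, which gives a convenient correctness check for an implementation.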

Experimental Setup

Datasets

Simulated Data:
- J=3 brain regions, M=60 wavelet coefficients
- Using spatial coordinates from live rat experiments (L₁=41, L₂=25, L₃=77 voxels)
- Varying signal strength δⱼ ∈ {0.1, 0.5, 0.7} and spatial covariance ψⱼ ∈ {0.2, 0.5, 0.8}
Real Data:
- Dead rat scan data (validating false positive rates)
- HCP test-retest database (42 subjects, J=92 default mode network regions)

Evaluation Metrics

Simulations: Mean Squared Error (MSE), Mean Absolute Deviation (MAD)
HCP Data: Concordance Correlation Coefficient (CCC) for assessing test-retest reliability
Dead Rat Data: False positive rate analysis
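
For reference, the concordance correlation coefficient can be computed as in the sketch below, which uses Lin's standard definition with 1/n moments. Whether the paper uses this exact normalization is not stated in this summary, and the function name and toy values are hypothetical.

```python
import numpy as np

def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()          # 1/n normalization
    cov_xy = np.mean((x - mean_x) * (y - mean_y))
    # Penalizes both weak correlation and shifts in location or scale.
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)

# Toy test and retest edge estimates; the retest has a constant offset.
test_scan = np.array([0.10, 0.40, 0.35, 0.80])
retest_scan = test_scan + 0.05
print(concordance_correlation(test_scan, retest_scan))
```

Unlike the Pearson correlation, the CCC drops below 1 when the retest estimates are systematically shifted or rescaled, which makes it a stricter measure of agreement between scans.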

Comparison Methods

ρ̂CA: Traditional correlation of averages estimator
ρ̂EBLUE: Correlation based on empirical best linear unbiased estimator
ρ̂ReML: Complete ReML estimator
ρ̂Vecchia: Vecchia approximation estimator

Implementation Details

Kernels: Gaussian kernel H(u;τ) = exp(-τ²u²/2), Matérn-5/2 kernel
Optimization: L-BFGS quasi-Newton method
Vecchia conditional set size: |Jᵢ| = 100
Significance testing: Benjamini-Yekutieli procedure, FDR < 0.2
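
The Benjamini-Yekutieli step-up rule can be sketched as follows. This is a generic implementation of the standard procedure at level q = 0.2, not code from the paper; how the per-edge p-values are obtained is outside its scope, and the function name and example p-values are hypothetical.

```python
import numpy as np

def benjamini_yekutieli(p_values, q=0.2):
    """Return a boolean mask of hypotheses rejected by the BY procedure."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Harmonic correction c(m) makes the rule valid under arbitrary dependence.
    c_m = np.sum(1.0 / np.arange(1, m + 1))
    thresholds = np.arange(1, m + 1) * q / (m * c_m)
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up: reject every hypothesis up to the largest passing rank.
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject

edge_p = [0.9, 0.001, 0.039, 0.008, 0.041]
print(benjamini_yekutieli(edge_p, q=0.2))
```

The harmonic factor makes BY more conservative than Benjamini-Hochberg, which suits edge-wise tests whose p-values are dependent through shared regions.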

Experimental Results

Main Results

Simulation Performance

Accuracy: ρ̂ReML exhibits the smallest standard deviation across all settings with medians closest to true values
Bias Analysis:
- When ρ=0.6, CA and EBLUE show significant bias toward zero
- High spatial covariance (ψ=0.8) exacerbates bias in CA and EBLUE
- ρ̂ReML maintains robustness across various settings

Numerical Results Example

Under moderate signal strength (δ=0.5):

Low spatial covariance (ψ=0.2): ρ̂ReML MSE of 0.008-0.025, compared with 0.016-0.033 for CA
High spatial covariance (ψ=0.8): The gap widens, with ρ̂ReML MSE of 0.012-0.028 compared with 0.056-0.194 for CA

Ablation Studies

Vecchia Approximation Verification: ρ̂Vecchia performance nearly identical to ρ̂ReML, validating the approximation method
Model Misspecification Robustness: ρ̂ReML outperforms traditional methods under alternative covariance structures
Oracle Estimator Comparison: Two-stage procedure shows minimal performance loss

Real Data Results

Dead Rat Scan Analysis

False Positive Control: ρ̂Vecchia shows no significant edges at the 5% significance level, while the CA method still identifies significant edges
FDR Control: Both methods show no significant edges after BY adjustment (q<0.2), as expected

HCP Test-Retest Analysis

Consistency Improvement: Mixed model methods show higher CCC for most subjects across all graph construction strategies
Edge Proportion: Across edge-selection proportions from 1% to 20%, mixed model methods consistently outperform CA
Statistical Significance: Among the top 10% of edges, approximately 60-80% of subjects show higher test-retest consistency with the mixed model methods than with CA

Main Research Directions

Voxel-level Modeling: Woolrich et al. (2004) mixed-effects models for task-related activation
Population-level Connectivity: Bowman et al. (2008) Bayesian hierarchical models
Frequency Domain Methods: Kang et al. (2012) frequency domain mixed-effects models
Spatiotemporal Modeling: Castruccio et al. (2018) VAR process approaches

Advantages of This Work

Resting-State Specific: Designed specifically for resting-state data, distinct from task-based studies
Individual Level: Focuses on individual brain network construction rather than population inference
Connectivity Priority: Emphasizes between-region correlation as the primary parameter rather than task effects
Computational Innovation: First application of Vecchia approximation in functional connectivity

Conclusions and Discussion

Main Conclusions

Method Effectiveness: Mixed-effects models significantly improve accuracy and reliability of functional connectivity estimation
Bias Correction: Successfully addresses systematic bias issues in CA estimators
Computational Feasibility: Vecchia approximation makes the method applicable to large-scale data
Practical Value: Demonstrates superior test-retest consistency in real data

Limitations

Computational Complexity: Despite approximation methods, still more computationally intensive than CA
Model Assumptions: Relies on Gaussian assumptions and separable covariance structures
Parameter Estimation: Some smoothing parameters require presetting rather than estimation
Predefined Regions: Depends on predefined brain parcellations rather than data-driven approaches

Future Directions

Subject-Specific Regions: Integrate data-driven region discovery methods
Multi-Scale Modeling: Extend to joint analysis across multiple wavelet scales
Non-Gaussian Extensions: Consider robustness to non-Gaussian distributions
Real-Time Applications: Develop more efficient online estimation algorithms

In-Depth Evaluation

Strengths

Theoretical Rigor: Provides comprehensive statistical framework and asymptotic properties
Methodological Innovation: Cleverly combines mixed-effects models with computational approximation techniques
Comprehensive Experiments: Encompasses simulations, control experiments, and real data validation
Strong Practicality: Addresses real problems in neuroscience
Reproducibility: Provides detailed implementation details and parameter settings

Weaknesses

Computational Overhead: Still carries substantial computational burden compared to traditional methods
Parameter Tuning: Requires numerous hyperparameter choices and model specifications
Scalability: Applicability to larger-scale datasets requires further verification
Biological Interpretation: Lacks in-depth discussion of biological significance of model parameters

Impact

Academic Contribution: Provides new statistical framework for functional connectivity analysis
Practical Value: Directly applicable to clinical and basic neuroscience research
Methodological Influence: Advances development of statistical methods in computational neuroscience
Reproducibility: Detailed methodology description facilitates subsequent research

Applicable Scenarios

Individual Brain Network Analysis: Particularly suitable for studies requiring accurate individual connectivity estimation
Clinical Applications: Disease diagnosis and treatment efficacy assessment
Longitudinal Studies: Research with high test-retest reliability requirements
Large-Scale Data: Analysis of neuroimaging data with high-dimensional spatiotemporal structure

References

The paper cites 63 related references, primarily including:

Achard et al. (2023): Theoretical analysis of between-region correlation estimators
Vecchia (1988): Likelihood approximation methods for spatial processes
Bowman et al. (2008): Bayesian hierarchical modeling of fMRI data
Kang et al. (2012, 2017): Spatiotemporal mixed-effects models
Castruccio et al. (2018): Multi-resolution spatiotemporal models

This paper makes important methodological contributions to fMRI functional connectivity analysis through rigorous statistical modeling and computational innovation, demonstrating high academic value and practical significance.