A Spatio-temporal CP decomposition analysis of New England region in the US
Sanogo
Spatio-temporal data consist of measurements of one or more raster fields, such as weather, traffic volume, crime rates, or disease incidence. Advances in modern technology have increased the amount of information available for this type of data, hence the rise of multidimensional data. In this paper we take advantage of the multidimensional structure of the data as well as its temporal and spatial structure. We use data from the NCAR Climate Data Gateway website, which provides data discovery and access services for global and regional climate model data. The daily values of total precipitation (prec), maximum temperature (tmax), and minimum temperature (tmin) are combined into a multidimensional array called a tensor. We propose a spatio-temporal principal component analysis to initialize the CP decomposition components, taking full advantage of the spatial and temporal structure of the data in the initialization step. The performance of our method is tested by comparison with the most popular initialization methods. We also run a clustering analysis to further demonstrate the performance of our approach.
Spatio-temporal data comprise measurements of one or more gridded fields, such as weather, traffic flow, crime rates, or disease incidence. Advances in modern technology have increased the volume of available information in such datasets, resulting in multidimensional data. This paper leverages the multidimensional structure of data along with its temporal and spatial characteristics. Using global and regional climate model data provided by the NCAR Climate Data Gateway website, the authors construct a multidimensional data tensor by combining daily values of total precipitation (prec), maximum temperature (tmax), and minimum temperature (tmin). The paper proposes spatio-temporal principal component analysis to initialize CP decomposition components, fully exploiting the spatial and temporal structure of the data during the initialization step of CP component analysis.
Problem to be Addressed: Traditional tensor decomposition methods (such as CP decomposition) lack initialization strategies tailored to the spatio-temporal correlations in climate data, which results in poor factor identifiability and low reconstruction accuracy.
Problem Significance:
Global climate change leads to frequent extreme weather events, necessitating more reliable prediction and diagnostic tools
Numerical Earth system models face challenges of lengthy computation times and exponential growth in data dimensionality
Statistical and machine learning methods are needed to complement physics-based models
Limitations of Existing Methods:
Although PCA can extract dominant variance modes, it processes variables independently and imposes orthogonality constraints, which limits the physical interpretability of its components
Random initialization and HOSVD initialization do not account for the inherent structure of spatio-temporal data
Existing tensor decomposition methods have limited applications in climate research
Research Motivation: Develop CP decomposition initialization strategies that specifically exploit the spatio-temporal correlations in climate data to improve factor identifiability and reconstruction accuracy.
Proposed a novel initialization procedure: Enhances the reconstruction quality and interpretability of CP decomposition by leveraging spatio-temporal correlations
Conducted an empirical evaluation on NCAR precipitation and temperature datasets: Provides benchmark comparisons with common initialization methods
Performed clustering analysis: Demonstrates the interpretive value and model performance of CP-derived factors
Provided a theoretical framework for spatio-temporal tensor decomposition: Offers a scalable analytical framework for climate data analysis
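Given a three-dimensional tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$, where $I$ is the temporal dimension, $J$ is the spatial dimension, and $K$ is the variable dimension, the objective is to find the optimal CP decomposition:
$$\mathcal{X} = \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r = [\![ \mathbf{A}, \mathbf{B}, \mathbf{C} ]\!]$$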
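To make the CP model above concrete, the following is a minimal NumPy sketch of the rank-R reconstruction and a plain alternating least squares (ALS) fit. The function names (`cp_reconstruct`, `khatri_rao`, `cp_als`, `relative_error`) and the `init` parameter are illustrative assumptions, not the authors' implementation; `init` simply marks the place where any initialization, random or spatio-temporal, would be supplied.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def khatri_rao(U, V):
    """Column-wise Kronecker product; rows indexed by (u, v), v varying fastest."""
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, U.shape[1])

def cp_als(X, R, init=None, n_iter=200, seed=0):
    """Plain ALS for a rank-R CP model of a 3-way array (illustrative only)."""
    I, J, K = X.shape
    if init is None:
        # Random Gaussian start; a structured initialization would go in `init`.
        rng = np.random.default_rng(seed)
        A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
    else:
        A, B, C = (np.array(F, dtype=float) for F in init)
    # Mode-n unfoldings whose column order matches khatri_rao above.
    X1 = X.reshape(I, J * K)
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)
    for _ in range(n_iter):
        # Each update solves a linear least-squares problem for one factor.
        A = X1 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X2 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X3 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def relative_error(X, A, B, C):
    """Frobenius-norm reconstruction error relative to the norm of X."""
    return np.linalg.norm(X - cp_reconstruct(A, B, C)) / np.linalg.norm(X)
```

Because ALS only guarantees a non-increasing error, the factors it returns depend on the starting point, which is why the choice of `init` matters for identifiability and reconstruction accuracy.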
Data Transformation: Converts the data matrix into a multivariate functional data set through Fourier basis transformation:
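$$\phi_0(t) = \frac{1}{\sqrt{T}}, \quad \phi_{2j-1}(t) = \sqrt{\frac{2}{T}} \sin\left(\frac{2\pi j t}{T}\right), \quad \phi_{2j}(t) = \sqrt{\frac{2}{T}} \cos\left(\frac{2\pi j t}{T}\right)$$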
Spatial Weight Matrix: Employs Moran's index combined with spatial weight matrix W to obtain the spatial correlation matrix
Feature Extraction: Extracts eigenvalues that can be either positive or negative along with their corresponding spatio-temporal principal components
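The three steps above can be sketched in NumPy as follows. This is a rough illustration under stated assumptions rather than the paper's procedure: the Fourier basis is taken to be the standard orthonormal basis on [0, T], Moran's index uses its usual global definition, and the feature extraction step is assumed to be an eigendecomposition of a symmetrized, spatially lagged covariance matrix, a construction chosen here because it naturally yields both positive and negative eigenvalues. All function names are hypothetical.

```python
import numpy as np

def fourier_basis(t, T, n_pairs):
    """Columns phi_0, phi_1 (sin), phi_2 (cos), ... evaluated at times t in [0, T]."""
    t = np.asarray(t, dtype=float)
    cols = [np.full_like(t, 1.0 / np.sqrt(T))]
    for j in range(1, n_pairs + 1):
        cols.append(np.sqrt(2.0 / T) * np.sin(2 * np.pi * j * t / T))
        cols.append(np.sqrt(2.0 / T) * np.cos(2 * np.pi * j * t / T))
    return np.column_stack(cols)

def morans_i(x, W):
    """Global Moran's I of values x (one per site) under spatial weights W."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return (x.size / W.sum()) * (z @ W @ z) / (z @ z)

def spatial_components(Z, W):
    """Assumed feature extraction: eigenpairs of Zc^T (W + W^T)/2 Zc / n.

    Z is a sites-by-features matrix (for example, Fourier coefficients per site).
    Eigenvalues may be positive (smooth patterns) or negative (locally
    alternating patterns); they are returned sorted by absolute value.
    """
    n = Z.shape[0]
    Zc = Z - Z.mean(axis=0)
    M = Zc.T @ (0.5 * (W + W.T)) @ Zc / n
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(-np.abs(vals))
    return vals[order], vecs[:, order]
```

For example, with four sites on a line and neighbor weights of 1, the values `[1, 2, 3, 4]` give a positive Moran's I, while the alternating values `[1, -1, 1, -1]` give a negative one, which is the kind of sign information the eigenvalues carry.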
Spatio-temporal Structure-Aware Initialization: Explicitly incorporates spatio-temporal correlations into the CP decomposition initialization process for the first time
Multi-scale Feature Extraction: Simultaneously captures temporal and spatial patterns through Fourier transformation and spatial weight matrices
Elimination of Additional Diagonalization Steps: Unlike TASD methods, avoids the SimDiag step, improving computational efficiency
Tensor Decomposition Methods: CP decomposition was first introduced by Hitchcock (1927) and later developed by Carroll and Chang (1970) and Harshman (1970)
Spatial PCA: Principal component analysis methods that account for spatial autocorrelation
Climate Data Analysis: Applications of Empirical Orthogonal Function (EOF) analysis in climate science
Deep Learning Methods: Applications of convolutional neural networks and graph neural networks in climate modeling
Methodological Innovation: Is the first to explicitly incorporate spatio-temporal correlations into CP decomposition initialization, with clear theoretical motivation
Experimental Comprehensiveness: Conducts comprehensive comparative experiments and clustering analysis on real climate data
Convincing Results: Achieves consistent performance improvements across multiple evaluation metrics
Practical Value: Provides new tools and perspectives for climate data analysis