A Graph Laplacian Eigenvector-based Pre-training Method for Graph Neural Networks

Dai, Njenga, Madhu et al.

The development of self-supervised graph pre-training methods is a crucial ingredient in recent efforts to design robust graph foundation models (GFMs). Structure-based pre-training methods are under-explored yet crucial for downstream applications which rely on underlying graph structure. In addition, pre-training traditional message passing GNNs to capture global and regional structure is often challenging due to the risk of oversmoothing as network depth increases. We address these gaps by proposing the Laplacian Eigenvector Learning Module (LELM), a novel pre-training module for graph neural networks (GNNs) based on predicting the low-frequency eigenvectors of the graph Laplacian. Moreover, LELM introduces a novel architecture that overcomes oversmoothing, allowing the GNN model to learn long-range interdependencies. Empirically, we show that models pre-trained via our framework outperform baseline models on downstream molecular property prediction tasks.

Basic Information

Paper ID: 2509.02803
Title: A Graph Laplacian Eigenvector-based Pre-training Method for Graph Neural Networks
Authors: Howard Dai, Nyambura Njenga, Hiren Madhu, Siddharth Viswanath, Ryan Pellico, Ian Adelstein, Smita Krishnaswamy
Classification: cs.LG (Machine Learning)
Publication Date: October 11, 2025 (arXiv preprint)
Paper Link: https://arxiv.org/abs/2509.02803v2

Abstract

This paper proposes a pre-training method for Graph Neural Networks (GNNs) based on graph Laplacian eigenvectors. To address the shortage of structure-based pre-training methods for graph foundation models (GFMs), the authors develop the Laplacian Eigenvector Learning Module (LELM), which pre-trains a GNN by having it predict the low-frequency eigenvectors of the graph Laplacian. The method also introduces an architectural design intended to overcome over-smoothing, so that the GNN can learn long-range dependencies. Experiments show that models pre-trained with this framework outperform baseline models on molecular property prediction tasks.

Research Background and Motivation

Problem Definition

Insufficient Structure-based Pre-training Methods: Existing GNN pre-training approaches rely primarily on feature reconstruction and contrastive learning, while pre-training methods based on graph structural properties remain relatively unexplored.
Over-smoothing Problem: Traditional message-passing GNNs struggle to capture global and regional structure, and tend to over-smooth as network depth increases.
Difficulty in Learning Long-range Dependencies: Existing GNN architectures have limited capacity to learn long-range interdependencies within graphs.

Research Significance

Development of graph foundation models requires effective self-supervised pre-training tasks
Structure-aware downstream applications require pre-training methods capable of capturing underlying graph structures
Applications such as molecular property prediction depend on understanding global graph structures

Limitations of Existing Methods

Contrastive Methods: Primarily use Jensen-Shannon estimators or InfoNCE objectives, lacking direct modeling of structural information
Prediction Methods: Mostly focus on graph reconstruction tasks, with few methods based on graph property prediction
Structural Representation Capacity: Existing methods struggle to effectively capture global graph structure information

Core Contributions

Proposes LELM Framework: The first method using graph Laplacian eigenvectors as pre-training targets
Innovative Architectural Design: Introduces graph-level MLP heads enabling GNNs to capture large-scale structures without requiring excessively deep networks
Node Feature Enhancement: Proposes enhanced node features based on graph diffusion operators, overcoming GNN expressiveness limitations
Experimental Validation: Demonstrates method effectiveness on molecular datasets, serving as both a standalone pre-training method and a plug-in for existing pipelines

Methodology Details

Task Definition

Given a graph $G = (V,E)$, the objective is to pre-train a GNN model to predict the $k$ lowest-frequency eigenvectors $\psi_1, \psi_2, \ldots, \psi_k$ of the graph Laplacian matrix $L = D - A$, where $L\psi_i = \lambda_i\psi_i$ and the eigenvalues are ordered $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_k$.
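To make the prediction target concrete, here is a minimal NumPy sketch that computes the $k$ lowest-frequency eigenpairs of $L = D - A$ for a small graph. The function name `lowest_laplacian_eigenpairs` and the 5-node path graph are illustrative assumptions, and the sketch keeps the trivial constant eigenvector ($\lambda_1 = 0$) in the output; whether the paper includes or skips it is not stated in this summary.

```python
import numpy as np

def lowest_laplacian_eigenpairs(adjacency, k):
    """Return L = D - A with its k smallest eigenvalues and eigenvectors."""
    A = np.asarray(adjacency, dtype=float)
    D = np.diag(A.sum(axis=1))          # degree matrix
    L = D - A                           # combinatorial graph Laplacian
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(L)
    return L, eigenvalues[:k], eigenvectors[:, :k]

# Illustrative graph: a path on 5 nodes, 0 - 1 - 2 - 3 - 4.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0

L, lam, psi = lowest_laplacian_eigenpairs(A, k=3)
# Column i of psi satisfies L @ psi[:, i] = lam[i] * psi[:, i].
```

Each column of `psi` is defined only up to sign, which is why a training loss for this target has to be sign-invariant.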

Model Architecture

The LELM framework comprises three core components:

1. Node Feature Enhancement

Wavelet Positional Encoding: Encodes relative position information between nodes

Randomly select two nodes $i, j$ and construct Dirac signals $\delta_i, \delta_j$
Apply the wavelet operators $\Psi_k = P^{2^{k-1}} - P^{2^k}$ for $k = 1, \ldots, J$, where $P = D^{-1}A$ is the diffusion operator
Wavelet positional encoding for node $m$: $w_m = [w_{m,1}, \ldots, w_{m,J}]$

Diffusion Dirac Encoding: Encodes local connectivity structure

For each node $m$ and each scale $k = 1, \ldots, J$, compute $d_{m,k} = \Psi_k(m, \cdot) P(m, \cdot)^T$
Diffusion Dirac encoding: $d_m = [d_{m,1}, \ldots, d_{m,J}]$
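The diffusion-based quantities above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the authors' implementation: the scale index is read as $k$ throughout, the graph is assumed to have no isolated nodes, and the function names and the 6-cycle example are hypothetical. Only the diffusion Dirac encoding is computed, since the per-entry definition of $w_{m,k}$ is not given in this summary.

```python
import numpy as np

def diffusion_operator(A):
    """Random-walk matrix P = D^{-1} A (assumes every node has degree > 0)."""
    deg = A.sum(axis=1)
    return A / deg[:, None]

def wavelet_operators(P, J):
    """Psi_k = P^(2^(k-1)) - P^(2^k) for k = 1, ..., J."""
    ops = []
    for k in range(1, J + 1):
        low = np.linalg.matrix_power(P, 2 ** (k - 1))
        high = np.linalg.matrix_power(P, 2 ** k)
        ops.append(low - high)
    return ops

def diffusion_dirac_encoding(A, J):
    """d[m, k-1] = <Psi_k(m, .), P(m, .)> for every node m."""
    P = diffusion_operator(A)
    # Row-wise inner product between Psi_k and P, one column per scale.
    cols = [np.sum(Psi * P, axis=1) for Psi in wavelet_operators(P, J)]
    return np.stack(cols, axis=1)

# Illustrative graph: a cycle on 6 nodes.
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

P = diffusion_operator(A)
Psi = wavelet_operators(P, J=3)
d = diffusion_dirac_encoding(A, J=3)   # shape (n, J)
```

Because every row of $P$ sums to 1, every row of each $\Psi_k$ sums to 0, so the wavelets act as band-pass filters between consecutive diffusion scales.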

2. Graph-level MLP

Base GNN: Processes the enhanced feature graph to generate node representations
Graph-level Aggregation: Concatenates all node representations into a graph-level vector $Z = [z_1, \ldots, z_n] \in \mathbb{R}^{nd}$
MLP Prediction Head: $\tilde{U} = \text{MLP}(Z)$ outputs predicted eigenvectors
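The following sketch illustrates why a graph-level head can mix information across distant nodes in a single step. It assumes a fixed number of nodes `n`, a two-layer MLP with `tanh` activations, and random untrained weights; all of these, along with the name `graph_level_head`, are hypothetical choices for illustration. How the paper handles graphs of varying size is not covered by this summary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k, hidden = 8, 4, 3, 32   # nodes, embedding dim, eigenvectors, hidden width

# Hypothetical, randomly initialised MLP weights (not trained).
W1 = 0.1 * rng.standard_normal((hidden, n * d))
b1 = np.zeros(hidden)
W2 = 0.1 * rng.standard_normal((n * k, hidden))
b2 = np.zeros(n * k)

def graph_level_head(H):
    """Map node embeddings H (n x d) to unnormalised predictions U_tilde (n x k)."""
    Z = H.reshape(-1)                   # concatenate z_1, ..., z_n into R^{nd}
    h = np.tanh(W1 @ Z + b1)
    return (W2 @ h + b2).reshape(n, k)

H = rng.standard_normal((n, d))         # stand-in for the base GNN output
U_tilde = graph_level_head(H)

# Perturb only node 0's embedding and compare the prediction at the last node.
H_perturbed = H.copy()
H_perturbed[0] += 1.0
U_perturbed = graph_level_head(H_perturbed)
far_node_changed = not np.allclose(U_tilde[-1], U_perturbed[-1])
```

In this sketch the output row for the last node depends on node 0's embedding regardless of their graph distance, whereas a node-level MLP applied to each $z_m$ separately could not produce that dependence.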

3. Eigenvector Prediction

Orthogonality constraints are imposed via QR decomposition: $\hat{U} = \text{QR}(\tilde{U})$, where $\text{QR}(\cdot)$ denotes the orthonormal factor $Q$ of the decomposition $\tilde{U} = QR$

Loss Function:

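Energy Loss: $\mathcal{L}_{\text{energy}} = \frac{1}{k}\sum_{i=1}^k \hat{u}_i^T L \hat{u}_i$, where $\hat{u}_i$ is the $i$-th column of $\hat{U}$
Eigenvector Loss: $\mathcal{L}_{\text{eigvec}} = \frac{1}{k}\sum_{i=1}^k \|L\hat{u}_i - \lambda_i\hat{u}_i\|$
Total Loss: $\mathcal{L} = \alpha \cdot \mathcal{L}_{\text{energy}} + \beta \cdot \mathcal{L}_{\text{eigvec}}$ (written $\mathcal{L}$ to distinguish it from the Laplacian $L$)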
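The orthogonalization step and the two loss terms can be sketched as follows. This is a NumPy illustration, not the authors' training code: the function name `lelm_losses` is hypothetical, the norm is assumed to be the Euclidean norm, and the default weights follow the values reported for the main experiments ($\alpha = 2$, $\beta = 1$).

```python
import numpy as np

def lelm_losses(U_tilde, L, eigenvalues, alpha=2.0, beta=1.0):
    """Return (energy loss, eigenvector loss, weighted total) for raw predictions."""
    Q, _ = np.linalg.qr(U_tilde)        # reduced QR: Q is n x k, orthonormal columns
    k = Q.shape[1]
    LQ = L @ Q
    energy = np.sum(Q * LQ) / k         # mean of u_i^T L u_i over columns
    residual = LQ - Q * eigenvalues[:k] # column i holds L u_i - lambda_i u_i
    eigvec = np.mean(np.linalg.norm(residual, axis=0))
    return energy, eigvec, alpha * energy + beta * eigvec

# Illustrative graph: a path on 6 nodes.
n, k = 6, 3
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
lam, vecs = np.linalg.eigh(L)

exact = lelm_losses(vecs[:, :k], L, lam)    # predictions equal to true eigenvectors
rng = np.random.default_rng(0)
U_rand = rng.standard_normal((n, k))
noisy = lelm_losses(U_rand, L, lam)         # random predictions
flipped = lelm_losses(-U_rand, L, lam)      # same predictions with flipped signs
```

With the true eigenvectors as input, the eigenvector loss vanishes and the energy loss equals the mean of the $k$ smallest eigenvalues, which is the minimum over all orthonormal $n \times k$ matrices. Flipping the sign of the prediction leaves every term unchanged.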

Technical Innovations

Graph-level MLP Design: Avoids the problem of node-level MLPs failing to learn long-range interactions
Eigenvector Objectives: Low-frequency Laplacian eigenvectors naturally encode global, regional, and local graph structures
Diffusion Operator Enhancement: Provides structural context information, enhancing GNN expressiveness
Dual Loss Mechanism: The energy loss ensures the correct subspace is recovered, and the eigenvector loss ensures the eigenvectors appear in the proper order

Experimental Setup

Datasets

ZINC-12k: 12,000 molecular graphs
ZINC-250k: 250,000 molecular graphs
QM9: 134,000 molecular graphs with multiple quantum chemical properties

Evaluation Metrics

MAE (Mean Absolute Error): Primary evaluation metric
ROC-AUC: Used for binary classification tasks

Comparison Methods

Baseline Models: GIN and GPS models trained without pre-training
Alternative Pre-training Targets: Node degree, local clustering coefficient, cycle counting, Laplacian eigenvalues
Existing Pre-training Methods: ContextPred, Masking, etc.
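For reference, the alternative structural targets listed above can all be computed from the adjacency matrix. The sketch below is illustrative only: the function name `structural_targets` is hypothetical, and cycle counting is reduced to triangle counting as the simplest instance, which may differ from the cycle lengths counted in the paper.

```python
import numpy as np

def structural_targets(A):
    """Node degree, local clustering coefficient, triangle count, Laplacian spectrum."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    # diag(A^3)[m] counts closed 3-walks at node m, i.e. twice its triangles.
    closed_walks3 = np.diag(np.linalg.matrix_power(A, 3))
    pairs = deg * (deg - 1)             # ordered neighbour pairs of each node
    clustering = np.divide(closed_walks3, pairs,
                           out=np.zeros_like(deg), where=pairs > 0)
    triangles = closed_walks3.sum() / 6 # each triangle is counted 6 times in the trace
    eigenvalues = np.linalg.eigvalsh(np.diag(deg) - A)
    return deg, clustering, triangles, eigenvalues

# Illustrative graph: triangle 0-1-2 with a pendant node 3 attached to node 2.
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (0, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

deg, clustering, triangles, eigenvalues = structural_targets(A)
```

Degree and clustering coefficient are node-local quantities, while Laplacian eigenvalues summarize the whole graph but discard where on the graph each frequency lives; eigenvectors retain that positional information.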

Implementation Details

Pre-training Epochs: 100-200
Fine-tuning Epochs: 150-500
Number of Eigenvectors: $k = 6$
Loss Weights: $\alpha = 2, \beta = 1$ (main experiments)
Optimizer: Adam
Learning Rate: 0.001

Experimental Results

Main Results

Performance Comparison on ZINC and QM9 Datasets (MAE, lower is better):

Model	ZINC full	ZINC subset	QM9 μ	QM9 α	QM9 εHOMO
GIN + LELM	0.130	0.353	0.484	0.489	0.00353
GIN (baseline)	0.228	0.438	0.472	1.132	0.00386
GPS + LELM	0.104	0.210	0.502	0.592	0.00372
GPS (baseline)	0.150	0.358	0.413	0.718	0.00434

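LELM lowers MAE on ZINC full, ZINC subset, QM9 α, and QM9 εHOMO for both GIN and GPS, with the largest gains on ZINC and on QM9 α with GIN (1.132 to 0.489). On QM9 μ, the baselines without pre-training perform better (0.472 vs 0.484 for GIN, 0.413 vs 0.502 for GPS).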
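The size of these differences is easier to compare as a relative MAE reduction, (baseline - pre-trained) / baseline. The short script below simply redoes that arithmetic on the values from the table above; the variable names are illustrative, and a negative value means the pre-trained model is worse than the baseline.

```python
# MAE values copied from the table above (lower is better).
tasks = ["ZINC full", "ZINC subset", "QM9 mu", "QM9 alpha", "QM9 eHOMO"]
results = {
    "GIN": {"baseline": [0.228, 0.438, 0.472, 1.132, 0.00386],
            "lelm":     [0.130, 0.353, 0.484, 0.489, 0.00353]},
    "GPS": {"baseline": [0.150, 0.358, 0.413, 0.718, 0.00434],
            "lelm":     [0.104, 0.210, 0.502, 0.592, 0.00372]},
}

def relative_reduction(baseline, pretrained):
    """Fraction by which pre-training reduces MAE relative to the baseline."""
    return (baseline - pretrained) / baseline

reductions = {
    model: {
        task: relative_reduction(b, p)
        for task, b, p in zip(tasks, values["baseline"], values["lelm"])
    }
    for model, values in results.items()
}

for model, per_task in reductions.items():
    for task, r in per_task.items():
        print(f"{model:3s} {task:11s} {100 * r:+6.1f}%")
```

Since the summary notes that error bars are not reported, these ratios describe single reported numbers and say nothing about run-to-run variance.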

Ablation Studies

Graph-level MLP vs Node-level MLP:

Model	ZINC full	ZINC subset
GIN + LELM (graph-level)	0.130	0.353
GIN + LELM (node-level)	0.152	0.435
GPS + LELM (graph-level)	0.104	0.210
GPS + LELM (node-level)	0.126	0.261

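The graph-level MLP achieves lower MAE than the node-level MLP on both datasets and for both architectures (for example, 0.130 vs 0.152 on ZINC full with GIN, and 0.210 vs 0.261 on ZINC subset with GPS).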

Comparison of Alternative Structural Pre-training Targets:

Pre-training Target	ZINC full	ZINC subset
LELM	0.130	0.353
Node Degree	0.238	0.471
Local Clustering Coefficient	1.493	1.551
Cycle Counting	0.285	0.420
Laplacian Eigenvalues	0.250	0.520

LELM achieves the lowest MAE among the structural pre-training targets compared, on both ZINC full and ZINC subset.

Enhancement of Existing Pre-training Methods

Adding LELM as a plug-in to existing pre-training pipelines on molecular prediction tasks:

Masking + LELM: Shows improvements across all 5 datasets
ContextPred + LELM: Shows improvements on most tasks

Experimental Findings

Importance of Graph-level Architecture: Graph-level MLP effectively learns long-range dependencies
Superiority of Eigenvectors: Laplacian eigenvectors are more suitable for pre-training than other structural targets
Generalizability: LELM can be combined with existing pre-training methods
Scalability: The method applies to different GNN architectures (GIN, GPS)

Classification of Graph Pre-training Methods

Contrastive Methods:
- Graph-node contrast (Deep Graph Infomax, etc.)
- Subgraph-node contrast (InfoGraph, etc.)
- Subgraph-subgraph contrast (GraphCL, etc.)
Prediction Methods:
- Graph reconstruction (node/edge masking, autoencoders)
- Property prediction (k-hop connectivity, meta-paths)

Applications of Laplacian Eigenvectors

Positional Encoding: Standard positional encoding in graph Transformers
Spectral Graph Neural Networks: Learning filters in the spectral domain
Spectral Clustering: Generating low-dimensional embeddings for clustering
Graph Partitioning: The Fiedler vector (the eigenvector of the second-smallest eigenvalue) yields approximate solutions to graph cut problems
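As a small illustration of the graph partitioning use case, the sketch below splits a graph by the sign of its Fiedler vector. The example graph (two triangles joined by a single bridge edge) and the variable names are assumptions chosen so that the expected cut is obvious.

```python
import numpy as np

# Two triangles, {0, 1, 2} and {3, 4, 5}, joined by the bridge edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

L = np.diag(A.sum(axis=1)) - A
eigenvalues, eigenvectors = np.linalg.eigh(L)   # eigenvalues in ascending order

fiedler = eigenvectors[:, 1]            # eigenvector of the second-smallest eigenvalue
partition = fiedler > 0                 # boolean side assignment for each node

# Count edges whose endpoints fall on different sides of the partition.
cut_size = sum(1 for i, j in edges if partition[i] != partition[j])
```

Which triangle ends up on the positive side depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful, not the labels.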

Positioning of This Work

According to the authors, LELM is the first property prediction method to use graph Laplacian eigenvectors as pre-training targets, filling a gap in structure-based pre-training methods.

Conclusions and Discussion

Main Conclusions

Effectiveness Validation: LELM improves GNN performance on most of the molecular property prediction tasks evaluated
Architectural Innovation: Graph-level MLP effectively addresses the over-smoothing problem
Universal Framework: Can serve as both a standalone method and an enhancement component for existing pipelines
Theoretical Guarantees: Loss function possesses necessary sign and basis invariance properties

Limitations

Unexplored Transfer Learning Capability: Currently validated only on same or related domain datasets
Computational Complexity: Requires Laplacian eigendecomposition, which may be challenging for large graphs
Cross-domain Generalization: Effects on synthetic or cross-domain datasets remain unknown
Statistical Significance: Error bars not reported due to computational cost constraints

Future Directions

Cross-domain Pre-training: Explore pre-training effects on synthetic or cross-domain datasets
Large-scale Applications: Investigate scalability on larger-scale graphs
Theoretical Analysis: Provide deeper analysis of why Laplacian eigenvectors are good pre-training targets
Architecture Optimization: Further optimize graph-level MLP design

In-depth Evaluation

Strengths

Strong Novelty: First application of Laplacian eigenvectors to GNN pre-training with innovative approach
Solid Theoretical Foundation: Laplacian eigenvectors have deep theoretical foundations in graph theory
Clever Architectural Design: Graph-level MLP effectively addresses long-range dependency learning
Comprehensive Experiments: Includes multiple comparative, ablation, and enhancement experiments
Good Generalizability: Compatible with different GNN architectures and existing pre-training methods

Weaknesses

Limited Application Domains: Primarily validated on molecular data; effects on other graph types unknown
Computational Overhead: Eigendecomposition computational cost may limit large-scale applications
Hyperparameter Sensitivity: Lack of systematic analysis for hyperparameter selection (e.g., loss weights)
Insufficient Theoretical Explanation: Lacks deep theoretical analysis of why the method is effective

Impact

Academic Value: Provides new research directions for graph pre-training
Practical Value: Potential value in practical applications such as molecular property prediction
Reproducibility: Complete code and experimental settings provided
Inspirational Value: May inspire more pre-training methods based on graph spectral properties

Applicable Scenarios

Molecular Property Prediction: The application scenario validated in the paper
Social Network Analysis: Tasks requiring understanding of global structure
Knowledge Graphs: Graph reasoning tasks where structural information is important
Biological Networks: Biological applications such as protein interaction networks

References

The paper cites multiple important related works, including:

Hu et al. (2019): "Strategies for pre-training graph neural networks" - Classical work on graph pre-training
Shaham et al. (2018): "SpectralNet" - Neural network approach to spectral clustering
Dwivedi et al. (2021): "Graph neural networks with learnable structural and positional representations" - Structural positional representation learning
Rampášek et al. (2022): "Recipe for a general, powerful, scalable graph transformer" - GPS architecture

Overall Assessment: This is a high-quality research paper proposing an innovative graph neural network pre-training method. While there is room for improvement in certain aspects, its core ideas are novel, experimental validation is comprehensive, and it makes important contributions to the graph pre-training field. The method's generalizability and extensibility demonstrate good application prospects.