Graph Prompt Learning (GPL) is an effective paradigm for connecting pretrained graph models to downstream tasks, alleviating label dependency and the mismatch between upstream and downstream tasks. Although existing GPL research explores various prompt strategies, their effectiveness and underlying mechanisms remain unclear. This paper identifies two critical limitations: (1) lack of consensus on underlying mechanisms, since different strategies intervene in different model spaces (input-level, layer-level, representation-level); (2) limited scenario adaptability, since most methods struggle to generalize under data distribution shifts. Through theoretical analysis, the paper shows that representation-level prompting is essentially equivalent to fine-tuning a simple downstream classifier, and argues that graph prompt learning should focus on unleashing pretrained model capabilities while letting the classifier adapt to downstream scenarios. Based on this finding, it proposes UniPrompt, a method that can adapt to any pretrained model and achieves superior performance in both in-domain and out-of-domain scenarios.
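Graph prompt learning aims to address the mismatch between graph pretrained models and downstream tasks, but existing methods face two key challenges: a lack of consensus on which model space prompts should act in (input-level, layer-level, or representation-level), and limited adaptability when the data distribution shifts.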
Through motivating experiments, the paper finds that existing representation-level prompting methods (e.g., GPPT, GraphPrompt) perform unstably when the pretrained model is switched, sometimes even underperforming simple linear probing. This suggests that existing methods may fall into a "pseudo-adaptation" trap.
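Given a graph $G = (V, E, X, Y)$, where $V$ is the node set, $E$ is the edge set (with adjacency matrix $A$), $X$ is the feature matrix, and $Y$ is the label set. The objective is to optimize the prediction function through learnable prompt parameters $\Psi$ while freezing the pretrained encoder $f_\theta$. In the notation of the optimization objective given below, with prompt function $p_\Psi$ and classifier $g_\phi$, the prediction for node $v_i$ is $g_\phi(f_\theta(p_\Psi(A, X))_i)$, and only $\Psi$ and $\phi$ are trained.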
Theorem 4.1: Given a linear prompt function $p_\Psi$ applied to an encoder output $h$ and a classifier $g_\phi$, there exists an equivalent linear classifier $g_{\phi'}$ such that $g_\phi(p_\Psi(h)) = g_{\phi'}(h)$.
This theoretical result indicates that representation-level prompting is equivalent to linear probing in both function space and optimization objectives, suggesting focus should shift toward input-level prompting.
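The equivalence can be checked numerically in the simplest case. The sketch below assumes an affine prompt and an affine classifier, both hypothetical stand-ins (the names `P`, `b`, `W`, `c` and the random "encoder outputs" `H` are illustrative and do not come from the paper), and folds the two maps into one linear probe.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, d, n_classes = 5, 8, 3

# Stand-in for frozen encoder outputs; random values for illustration only.
H = rng.normal(size=(n_nodes, d))

# Hypothetical linear representation-level prompt: h -> h @ P + b
P = rng.normal(size=(d, d))
b = rng.normal(size=d)

# Hypothetical linear classifier on the prompted representation: z -> z @ W + c
W = rng.normal(size=(d, n_classes))
c = rng.normal(size=n_classes)

# Prompt followed by classifier.
logits_prompted = (H @ P + b) @ W + c

# The same function written as a single linear classifier on H.
W_eq = P @ W
c_eq = b @ W + c
logits_probe = H @ W_eq + c_eq

print(np.allclose(logits_prompted, logits_probe))
```

Because `(H @ P + b) @ W + c` expands to `H @ (P @ W) + (b @ W + c)`, any setting of the prompt parameters in this linear case corresponds to some setting of a plain linear probe, which is the intuition behind the theorem.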
#### 1. Prompt Graph Initialization

Initialize the prompt graph using kNN, keeping for each node $i$ only its top-$k$ most similar nodes:

$$A^{\text{kNN}}_{ij} = \begin{cases} S_{ij}, & \text{if } S_{ij} \in \text{top-k}\{S_{i \cdot}\} \\ 0, & \text{otherwise} \end{cases}$$

where similarity is computed as:

$$S_{ij} = \frac{x_i x_j^T}{\|x_i\|_2 \|x_j\|_2}$$

#### 2. Parameterization Mechanism

Introduce learnable weights $w_{ij}$ for each edge using a gating mechanism:

$$\tilde{A}_{ij} = \text{ELU}(w_{ij} \cdot \alpha - \alpha) + 1$$

#### 3. Bootstrap Ensemble

Employ an iterative update strategy to prevent model collapse:

$$\hat{A}^{(t)} = \tau \hat{A}^{(t-1)} + (1-\tau) \tilde{A}$$

where $\hat{A}^{(0)} = A$ and $\tau \in [0,1]$ controls the balance between original and prompt graphs.

#### 4. Optimization Objective

Jointly optimize prompt parameters and classifier:

$$\min_{\phi, \Psi} \frac{1}{|V_L|} \sum_{v_i \in V_L} \ell_D(g_\phi(f_\theta(p_\Psi(A,X))_i), y_i)$$

## Experimental Setup

### Datasets

Nine node classification datasets are used:

- **Homophilic Graphs**: Cora, CiteSeer, PubMed
- **Heterophilic Graphs**: Cornell, Texas, Wisconsin, Chameleon, Actor, Squirrel

### Evaluation Metrics

- **Accuracy**: Node classification accuracy
- **Few-Shot Settings**: 1-shot, 3-shot, 5-shot learning

### Comparison Methods

- **Baseline Methods**: Fine-tune, Linear-probe
- **GPL Methods**: GPPT, GraphPrompt, All-in-one, GPF/GPF+, EdgePrompt/EdgePrompt+
- **Pretrained Models**: DGI, GRACE, GraphMAE

### Implementation Details

- Use 2-layer GCN/GAT as backbone network
- Train for 2000 epochs with early stopping patience of 20
- 5 random seeds × 20 repeated experiments

## Experimental Results

### Main Results

#### 1-Shot In-Domain Node Classification

UniPrompt achieves significant improvements on heterophilic graphs:

- Cornell: Improves from best baseline 34.56% to 51.13% on DGI
- Texas: Improves from best baseline 37.50% to 48.21%
- Wisconsin: Improves from best baseline 33.91% to 58.75%

#### Cross-Domain Node Classification

Under 1-shot cross-domain settings:

- PubMed: Improves from 46.84% to 55.01%
- Cornell: Improves from 40.77% to 51.58%

### Ablation Studies

Validates key components through replacement experiments:

- **Random_Topo**: Replacing kNN with random topology causes performance degradation
- **Simple_Add**: Simple addition replacing bootstrap strategy leads to overfitting
- **Discard_Topo**: Completely discarding original graph causes significant performance drop on homophilic graphs

### Hyperparameter Analysis

- **τ Parameter**: Heterophilic graphs benefit from smaller τ values (0.999-0.9999), while homophilic graphs show stable performance at τ≥0.9999
- **k Parameter**: Sparse heterophilic graphs benefit most, while dense and homophilic graphs remain relatively stable

### Computational Overhead

- Preprocessing time: approximately 1.3 seconds
- Training time per epoch increases moderately
- GPU memory usage remains acceptable

## Related Work

### Graph Pretraining

- **Contrastive Learning Methods**: DGI, GRACE, GraphCL, etc., learn representations by maximizing mutual information
- **Generative Methods**: GraphMAE and similar approaches learn representations through masked reconstruction

### Graph Prompt Learning

- **Input-Level Prompting**: GPF series methods add prompt vectors in feature space
- **Representation-Level Prompting**: GPPT, GraphPrompt, etc., add prompts at output layers
- **Layer-Level Prompting**: Integrate prompt information across GNN layers

### Graph Foundation Models

Recent developments in graph foundation models provide new application scenarios and challenges for GPL.

## Conclusions and Discussion

### Main Conclusions

1. **Theoretical Insight**: Representation-level prompting is equivalent to linear classifier fine-tuning; focus should be on input-level prompting
2. **Design Principles**: Prompts should unleash pretrained model capabilities while classifiers adapt to downstream tasks
3. **Practical Method**: UniPrompt achieves universal model adaptation through adaptive topological prompting

### Limitations

1. **LLM Integration Limitation**: Does not explore integration with large language models
2. **Hyperparameter Dependency**: τ and k parameters require tuning for different dataset types
3. **Limited Task Coverage**: Primarily evaluates node classification; other graph tasks require validation
4. **Noise Sensitivity**: Relatively sensitive to feature noise

### Future Directions

1. Extend to graph classification, link prediction, and other tasks
2. Combine with LLMs to build more powerful graph foundation models
3. Improve robustness to noise and distribution shifts
4. Explore automatic hyperparameter selection mechanisms

## In-Depth Evaluation

### Strengths

1. **Outstanding Theoretical Contribution**: First theoretically unifies understanding of different prompting mechanisms, providing important insights
2. **Clever Method Design**: Bootstrap ensemble strategy effectively prevents model collapse; kNN initialization reasonably leverages feature similarity
3. **Comprehensive Experiments**: Covers multiple pretrained models, dataset types, and evaluation settings
4. **High Practical Value**: Simple and effective method, easy to implement and deploy

### Weaknesses

1. **Limited Theoretical Analysis**: Primarily addresses linear cases; analysis of nonlinear prompting is insufficient
2. **Computational Overhead**: kNN construction and iterative updates increase computational cost
3. **Parameter Sensitivity**: Key hyperparameters require careful tuning with limited automation
4. **Noise Robustness**: Performance significantly degrades under feature noise

### Impact

1. **Academic Value**: Provides important theoretical foundation and design principles for graph prompt learning
2. **Practical Significance**: Improves adaptability and generalization of pretrained graph models
3. **Research Inspiration**: Guides subsequent research, particularly highlighting the importance of input-level prompting

### Applicable Scenarios

1. **Few-Shot Learning**: Graph learning tasks with scarce annotated data
2. **Cross-Domain Transfer**: Scenarios where pretraining and downstream tasks have different distributions
3. **Heterophilic Graph Processing**: Graph data where traditional homophily assumptions do not hold
4. **Rapid Deployment**: Applications requiring quick deployment of pretrained models

## References

The paper cites 91 relevant references covering multiple domains including graph neural networks, graph self-supervised learning, and graph prompt learning, providing a solid theoretical foundation for the research.

---

**Summary**: Through in-depth theoretical analysis and extensive experimental validation, this paper provides important theoretical insights and practical methods for the graph prompt learning field. The UniPrompt method is simple, effective, and demonstrates good generality and adaptability, making valuable contributions to the development of graph foundation models.
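
The kNN initialization, ELU gating, and bootstrap update from the method description can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy data, the exclusion of self-similarity, and the restriction of gates to kNN edges are all assumptions, and no training loop is included.

```python
import numpy as np


def knn_prompt_graph(X, k):
    """Keep, for each row i, the k largest cosine similarities S_ij; zero the rest."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)
    S = Xn @ Xn.T
    ranked = S.copy()
    np.fill_diagonal(ranked, -np.inf)  # assumption: a node is not its own neighbor
    top = np.argsort(-ranked, axis=1)[:, :k]
    rows = np.arange(X.shape[0])[:, None]
    A_knn = np.zeros_like(S)
    A_knn[rows, top] = S[rows, top]
    return A_knn


def elu_gate(w, alpha):
    """ELU(w * alpha - alpha) + 1, elementwise; equals 1 when w == 1."""
    z = w * alpha - alpha
    return np.where(z > 0, z, np.expm1(np.minimum(z, 0.0))) + 1.0


def bootstrap_update(A_hat, A_tilde, tau):
    """One step of A_hat <- tau * A_hat + (1 - tau) * A_tilde."""
    return tau * A_hat + (1.0 - tau) * A_tilde


# Toy features and a toy original graph (illustrative values only).
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
A = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)

A_knn = knn_prompt_graph(X, k=1)
w = np.full_like(A, 1.5)  # stand-in for learned edge weights
A_tilde = elu_gate(w, alpha=2.0) * (A_knn != 0)  # assumption: gates act only on kNN edges

tau, steps = 0.9, 5
A_hat = A.copy()  # A_hat^(0) = A
for _ in range(steps):
    A_hat = bootstrap_update(A_hat, A_tilde, tau)
```

With $\tilde{A}$ held fixed, unrolling the update gives $\hat{A}^{(T)} = \tau^T A + (1-\tau^T)\tilde{A}$, so the original graph keeps weight $\tau^T$ after $T$ updates. Purely for illustration, if there were one update per epoch over 2000 epochs, $\tau = 0.999$ would leave about 0.135 of the weight on $A$ and $\tau = 0.9999$ about 0.819, which is consistent with smaller τ letting the prompt graph contribute more.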