2025-11-17T05:52:13.175509

Accelerated Evolving Set Processes for Local PageRank Computation

Huang, Luo, Xiao et al.

This work proposes a novel framework based on nested evolving set processes to accelerate Personalized PageRank (PPR) computation. At each stage of the process, we employ a localized inexact proximal point iteration to solve a simplified linear system. We show that the time complexity of such localized methods is upper bounded by $\min\{\tilde{\mathcal{O}}(R^2/Îµ^2), \tilde{\mathcal{O}}(m)\}$ to obtain an $Îµ$-approximation of the PPR vector, where $m$ denotes the number of edges in the graph and $R$ is a constant defined via nested evolving set processes. Furthermore, the algorithms induced by our framework require solving only $\tilde{\mathcal{O}}(1/\sqrtÎ±)$ such linear systems, where $Î±$ is the damping factor. When $1/Îµ^2\ll m$, this implies the existence of an algorithm that computes an $\ epsilon $-approximation of the PPR vector with an overall time complexity of $\tilde{\mathcal{O}}\left(R^2 / (\sqrtÎ±Îµ^2)\right)$, independent of the underlying graph size. Our result resolves an open conjecture from existing literature. Experimental results on real-world graphs validate the efficiency of our methods, demonstrating significant convergence in the early stages.

academic

Accelerated Evolving Set Processes for Local PageRank Computation

Basic Information

Paper ID: 2510.08010
Title: Accelerated Evolving Set Processes for Local PageRank Computation
Authors: Binbin Huang, Baojian Zhou, Luo Luo, Deqing Yang, Yanghua Xiao (Fudan University)
Category: cs.LG
Conference: 39th Conference on Neural Information Processing Systems (NeurIPS 2025)
Paper Link: https://arxiv.org/abs/2510.08010v2

Abstract

This paper proposes a novel framework based on nested evolving set processes to accelerate personalized PageRank (PPR) computation. At each stage, localized inexact proximal point iterations are employed to solve simplified linear systems. The research demonstrates that the time complexity upper bound for such localized methods is $\min\{\tilde{\mathcal{O}}(R^2/ε^2), \tilde{\mathcal{O}}(m)\}$ to obtain an ε-approximation of the PPR vector, where m denotes the number of edges in the graph and R is a constant defined through nested evolving set processes. The framework-induced algorithm requires solving only $\tilde{\mathcal{O}}(1/\sqrt{α})$ such linear systems, where α is the damping factor. When $1/ε^2 \ll m$ , this implies an algorithm that computes an ε-approximation of the PPR vector with total time complexity $\tilde{\mathcal{O}}(R^2/(\sqrt{α}ε^2))$ , independent of the underlying graph size.

Research Background and Motivation

Problem Definition

The personalized PageRank (PPR) vector π ∈ ℝⁿ is defined as:

(I - (1-α)(I + AD⁻¹)/2)π = αeₛ

where eₛ is the standard basis vector corresponding to source node s, α ∈ (0,1) is the damping factor, and A and D are the adjacency matrix and degree matrix of undirected graph G(V,E), respectively.

Research Motivation

Importance: PPR is a core tool for graph analysis, widely applied in local clustering, diffusion process modeling, graph neural networks, and other domains
Existing Limitations:
- Standard methods such as APPR have time complexity O(1/(αε))
- Accelerated methods face challenges where momentum terms destroy residual monotonicity
- Existing accelerated methods may access the entire graph, resulting in O(m/√α) time complexity
Open Question: Whether there exists a localized accelerated method with time complexity dependent on 1/√α rather than 1/α

Core Contributions

AESP Framework: Proposes the Accelerated Evolving Set Processes (AESP) framework, using $\tilde{O}(1/\sqrt{α})$ short evolving set processes rather than a single long process
Theoretical Guarantees: Establishes time complexity of $\tilde{O}(\text{vol}(S_t)/(\sqrt{α}γ_t))$ , matching the conjectured acceleration bounds in existing literature
Locality Guarantees: Proves an upper bound on $\text{vol}(S_t)/γ_t$ of $\min\{O(R^2/ε^2), 2m\}$ , achieving graph-size-independent complexity when $1/ε^2 \ll m$
Experimental Validation: Verifies method effectiveness on large-scale real graphs, demonstrating significant acceleration in early stages

Methodology Details

Task Definition

Given precision parameter ε, design a local algorithm to compute ε-approximation π̂ satisfying $\|D^{-1}(π̂ - π)\|_∞ ≤ ε$ , while avoiding access to the entire graph.

Core Idea: Nested Evolving Set Processes

1. Problem Reformulation

Reformulate the linear system as a strongly convex optimization problem:

min f(x) = (1/2)x⊤Qx - αx⊤D^(-1/2)b

where Q = ((1+α)/2)I - ((1-α)/2)D^(-1/2)AD^(-1/2), with eigenvalues λ(Q) ∈ α,1.

2. Nested ESP Definition

In outer loop iteration t, local solver M maintains a sequence of active sets {S_t^(k)}_{k≥0} on inner loop iterations k, with updates restricted to nodes in the active set.

3. AESP Update Rule

x^(t) = M(φ_t, y^(t-1), η, α, b, G)
y^(t) = x^(t) + β_t(x^(t) - x^(t-1))

where β_t is the momentum weight and M is the local operator.

Localized Inexact Proximal Operators

LOCGD (Local Gradient Descent)

z_t^(k+1) = z_t^(k) - (2∇h_t(z_t^(k)) ∘ 1_{S_t^k})/(1 + α + 2η)

Active node set: $S_t^k = \{u : |∇_u h_t^{-1/2}(z_t^{(k)})| ≥ ε_t\}$

LOCAPPR (Local APPR)

Coordinate-level updates for $u_i ∈ S_t^k$ :

z_t^(k_{i+1}) = z_t^(k_i) - (2∇h_t(z_t^{(k_i)}) ∘ 1_{u_i})/(1 + α + 2η)

Technical Innovations

Monotonicity Preservation: By solving regularized PPR linear systems with constant condition number, ensures monotonic decrease of ℓ₁ norm of D^{1/2}-scaled gradients
Locality Control: Introduces constant R to bound gradient ratios in nested ESP processes
Acceleration-Locality Balance: Achieves trade-off between condition number 1/α dependency and per-round time complexity O(R²/ε²)

Experimental Setup

Datasets

Experiments use 19 real-world graphs ranging from small to ultra-large scale:

Medium-scale: com-dblp (317K nodes, 1M edges)
Large-scale: ogb-mag240m (244M nodes, 1.7B edges), ogbn-papers100M (111M nodes, 1.6B edges)
Others: com-friendster, wiki-en21, etc.

Evaluation Metrics

Error Measure: $\log \|D^{-1}(π̂-π)\|_∞$
Efficiency Measure: Number of operations, running time
Acceleration Ratio: Performance improvement relative to baseline methods

Comparison Methods

APPR: Standard approximate personalized PageRank algorithm
APPR-opt: APPR with optimal step size
LOCGD: Local gradient descent
ASPR: Accelerated sparse personalized PageRank
FISTA: Fast iterative shrinkage-thresholding algorithm

Implementation Details

Damping factor: α = 0.01, 0.1
Precision threshold: ε = 10⁻⁶
Random selection of 5 source nodes for testing
Python implementation with numba acceleration

Experimental Results

Main Results

Large-scale Graph Performance

On four large-scale graphs (ogb-mag240m, ogbn-papers100M, com-friendster, wiki-en21):

AESP-LOCAPPR performs best, benefiting from online coordinate updates
Early-stage Acceleration: AESP methods achieve faster convergence than baseline methods in early stages
Operation Reduction: 2-10x fewer operations compared to APPR

α Dependency Analysis

When α is small, AESP significantly accelerates standard local methods:

Testing in range α ∈ 10⁻³, 10⁻¹
Acceleration ratio increases as α decreases, validating √α dependency
Both running time and operation count are significantly reduced

Initialization Strategy Comparison

Comparison of three initialization strategies for z_t^(0):

Cold Start: z_t^(0) = 0
Previous Estimate: z_t^(0) = x^(t-1)
Momentum Initialization: z_t^(0) = y^(t-1) (recommended)

Momentum initialization strategy performs best, requiring the fewest outer loop iterations.

Ablation Studies

Constant R Analysis

R remains a small constant (1.0-1.4) across 19 graphs
R is insensitive to graph size and condition number
Validates the reasonableness of bounded R assumption in theoretical analysis

vol(S_t)/γ_t Ratio Analysis

Experiments verify the theoretical bound $\text{vol}(S_t)/γ_t ≤ \min\{C_t^0/ε_t, 2m\}$ .

PPR Computation Methods

Iterative Methods: Conjugate gradient, Chebyshev methods, complexity O(m/√α)
Local Methods: APPR (O(1/(αε))), variational methods (O(1/((α+μ²)ε)))
Acceleration Attempts: FISTA, linear coupling destroy monotonicity, cannot guarantee locality

Evolving Set Processes

Concept proposed by Morris and Peres for analyzing mixing times
Zhou et al.'s localized Chebyshev method, but lacking convergence guarantees

Accelerated Optimization

Catalyst framework: inexact accelerated proximal point methods
This paper adapts it to local methods while preserving monotonicity

Conclusions and Discussion

Main Conclusions

Theoretical Breakthrough: First to achieve provably local acceleration for PPR computation, improving time complexity from O(1/α) to O(1/√α)
Practicality: When ε ≥ 1/√m, algorithm complexity is independent of graph size
Generality: Framework extends to variational forms and other related problems

Limitations

Constant R Bound: Cannot universally bound R by graph size or input parameters
ε² Dependency: When ε < 1/√m, local bound degrades to O(m/√α)
Theory-Practice Gap: Conservative estimates of ε_t may lead to suboptimal bounds

Future Directions

Improved Bounds: Explore whether conjectured O(1/(√αε)) complexity is achievable
Eliminate R Dependency: Remove constant R through constraints or adaptive restarts
Extended Applications: Apply techniques to bidirectional PPR, single-source PPR estimation, etc.

In-Depth Evaluation

Strengths

Significant Theoretical Contribution: Resolves open conjecture in literature, establishes rigorous theoretical guarantees
Strong Method Innovation: Cleverly combines Catalyst framework with local evolving set processes
Comprehensive Experiments: Validation on 19 graphs of varying scales, from small to ultra-large
Clear Presentation: Rigorous mathematical derivations, detailed algorithm descriptions

Weaknesses

Practical Limitations: Advantages not apparent when ε is very small; R bounding issue affects practical application
Implementation Complexity: Nested structure and parameter tuning increase implementation difficulty
Incomplete Comparisons: Lacks detailed comparison with latest acceleration methods

Impact

Academic Value: Provides new insights for graph algorithm acceleration, significant theoretical importance
Practical Value: Potential applications in large-scale graph analysis, recommendation systems, etc.
Reproducibility: Code publicly available, experimental setup detailed

Applicable Scenarios

Large-scale Graph Analysis: Especially when precision requirements are not extreme (ε ≥ 1/√m)
Real-time Recommendation Systems: Requires fast computation of local importance scores
Graph Neural Networks: As preprocessing step for computing node importance

References

This paper cites 52 relevant references covering multiple domains including PPR computation, accelerated optimization, and local algorithms, providing solid theoretical foundation for the research.

Overall Assessment: This is a high-quality paper balancing theory and practice, achieving significant progress on the important problem of PPR computation acceleration. Despite some theoretical limitations, its innovation and practicality make it an important contribution to the field.