2025-11-20T02:31:13.678138

Lossless Derandomization for Undirected Single-Source Shortest Paths and Approximate Distance Oracles

Yan
A common step in algorithms related to shortest paths in undirected graphs is that, we select a subset of vertices as centers, then grow a ball around each vertex until a center is reached. We want the balls to be as small as possible. A randomized algorithm can uniformly sample $r$ centers to achieve the optimal (expected) ball size of $Θ(n/r)$. A folklore derandomization is to use the $O(\log n)$ approximation for the set cover problem in the hitting set version where we want to hit all the balls with the centers. However, the extra $O(\log n)$ factor is sometimes too expensive. For example, the recent $O(m\sqrt{\log n\log\log n})$ undirected single-source shortest path algorithm [DMSY23] beats Dijkstra's algorithm in sparse graphs, but the folklore derandomization would make it dominated by Dijkstra's. In this paper, we exploit the fact that the sizes of these balls can be adaptively chosen by the algorithm instead of fixed by the input. We propose a simple deterministic algorithm achieving the optimal ball size of $Θ(n/r)$ on average. Furthermore, given any polynomially large cost function of the ball size, we can still achieve the optimal cost on average. It allows us to derandomize [DMSY23], resulting in a deterministic $O(m\sqrt{\log n\log\log n})$ algorithm for undirected single-source shortest path. In addition, we show that the same technique can also be used to derandomize the seminal Thorup-Zwick approximate distance oracle [TZ05], also without any loss in the time/space complexity.
academic

Lossless Derandomization for Undirected Single-Source Shortest Paths and Approximate Distance Oracles

Basic Information

  • Paper ID: 2510.12598
  • Title: Lossless Derandomization for Undirected Single-Source Shortest Paths and Approximate Distance Oracles
  • Author: Shuyi Yan (BARC, University of Copenhagen)
  • Classification: cs.DS (Data Structures and Algorithms)
  • Publication Date: October 14, 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2510.12598

Abstract

This paper investigates a fundamental problem in shortest path algorithms on undirected graphs: how to select a subset of vertices as centers and grow balls from each vertex until reaching a center, with the goal of minimizing the total ball size. While randomized algorithms can achieve optimal expected ball size Θ(n/r) by uniformly sampling r centers, traditional derandomization methods introduce an additional O(log n) factor, which is prohibitively expensive in certain applications. By exploiting the fact that ball sizes can be adaptively chosen by the algorithm, this paper proposes a simple deterministic algorithm that achieves optimal ball size Θ(n/r) in expectation and extends to arbitrary polynomial-sized cost functions. The technique successfully derandomizes the O(m√(log n log log n)) single-source shortest path algorithm of DMSY23 and the classical Thorup-Zwick approximate distance oracle without loss in time/space complexity.

Research Background and Motivation

Core Problem

The core problem addressed in this paper is the Hitting Growable Balls problem. Many graph algorithms, particularly those for shortest paths, distance oracles, and sparse subgraph construction, encounter the following pattern:

  1. Select a subset R of vertices as centers
  2. Grow a ball B(v) from each vertex v until reaching some center
  3. Algorithm performance depends on |R| and the total size of all balls

Problem Significance

This problem holds fundamental importance in graph algorithms, affecting the efficiency of multiple critical algorithms:

  • Single-Source Shortest Paths: The DMSY23 algorithm first breaks the O(m + n log n) barrier of Dijkstra's algorithm on sparse graphs
  • Distance Oracles: The Thorup-Zwick oracle is a classical data structure for approximate distance queries
  • Graph Sparsification: Similar techniques are employed in constructing sparse subgraphs

Limitations of Existing Methods

Traditional derandomization approaches have critical shortcomings:

  1. Cost of Folk Methods: Using the O(log n)-approximation algorithm for set cover to derandomize introduces an additional O(log n) factor
  2. Severe Performance Loss: For the DMSY23 algorithm, this additional factor degrades the time complexity from O(m√(log n log log n)) to O(m log n √(log log n)), being surpassed again by Dijkstra's algorithm
  3. Fixed Ball Size Assumption: Traditional methods assume ball sizes are fixed by the input, failing to exploit algorithmic adaptivity

Research Motivation

The key insight of this paper is: ball sizes can be adaptively chosen by the algorithm rather than being fixed by the input. This provides new possibilities for designing superior derandomization algorithms.

Core Contributions

  1. Proposes a deterministic algorithm for hitting growable balls: Designs Algorithm 1 that achieves optimal ball size Θ(n/r) in expectation without introducing additional logarithmic factors
  2. Achieves lossless derandomization of DMSY23: Transforms the randomized O(m√(log n log log n)) single-source shortest path algorithm into a deterministic algorithm with identical complexity
  3. Derandomizes the Thorup-Zwick distance oracle: Removes the O(log n) factor from previous derandomization methods, achieving O(kn^(1/k)(m + n log n)) preprocessing time identical to the randomized version
  4. Provides a general theoretical framework: The method extends to arbitrary polynomial-sized cost functions with broad applicability

Methodology Details

Problem Formulation

The Hitting Growable Balls problem is formally defined as follows:

  • Input: n vertices, m initially empty balls, parameter r ∈ 1,n, cost function f(·)
  • Operations: The algorithm can select a ball and request the adversary to add a new vertex to it
  • Objective: Select r vertices as centers such that every ball is hit (contains at least one center), minimizing total cost ∑f(|Bi|)

Core Algorithm Architecture

Algorithm 1 is the central contribution of this paper:

Algorithm 1: Hitting Growable Balls
Input: number of vertices n, number of balls m, target number of centers r, cost parameter p
Output: at most r centers hitting all balls

1: Initialize all balls as empty, no centers selected
2: b ← ⌈2^(p+2)n/r⌉
3: Grow each ball to size min{b, n}
4: while not all balls are hit do
5:   m' ← number of unhit balls
6:   Repeatedly select new centers hitting the most unhit balls,
     until number of unhit balls ≤ m'/2^(p+1)
7:   b ← 2b
8:   Grow each unhit ball to size min{b, n}
9: return all selected centers

Algorithm Core Ideas

  1. Exponential Decreasing Strategy: In round i, use only O(r/2^i) centers, but allow ball sizes to grow to 2^i n/r
  2. Balanced Trade-off: Although later rounds have larger balls, the number of unhit balls decreases exponentially
  3. Adaptive Growth: Dynamically adjust ball sizes based on current unhit ball status

Theoretical Analysis

Theorem 4 proves the algorithm's correctness:

  • Number of Centers: Selects at most r centers
  • Total Cost: O_p(m(n/r)^p) = O_p(m·f(n/r))
  • Time Complexity: O_p(r + mn/r)

Key Proof Points:

  1. Analysis of center count bounds in each round
  2. Exponential decrease of unhit ball counts
  3. Obtaining total cost bounds through geometric series summation

Experimental Setup

Theoretical Verification

This is primarily a theoretical work, with correctness and complexity bounds verified through rigorous mathematical proofs.

Application Verification

The method's effectiveness is verified through two concrete applications:

  1. Single-Source Shortest Paths:
    • Integrates Algorithm 1 into the bundle construction phase of DMSY23
    • Cost function set to f(x) = x log x
    • Parameter choice r = m√(log log n)/√(log n)
  2. Thorup-Zwick Distance Oracle:
    • Applies Algorithm 1 at each level i to select A_{i+1}
    • Combines with RTZ05 techniques for efficient ball growth operations
    • Parameter setting r = n^(-1/k)|A_i|

Implementation Details

Data Structure Optimization:

  • Maintains count a_j of unhit balls that vertex j can hit
  • Uses doubly-linked lists L_k to maintain vertices with a_j = k
  • Supports O(1) insertion, deletion, and maximum finding operations

Experimental Results

Main Results

Theorem 2 (Single-Source Shortest Paths):

  • In undirected graphs with non-negative edge weights, there exists a deterministic comparison-addition algorithm
  • Time Complexity: O(m√(log n log log n))
  • Identical to the randomized DMSY23 algorithm complexity

Theorem 3 (Thorup-Zwick Oracle):

  • Thorup-Zwick approximate distance oracle of size O(kn^{1+1/k})
  • Can be deterministically constructed in O(kn^{1/k}(m + n log n)) time
  • Identical preprocessing time to the original randomized algorithm

Complexity Comparison

AlgorithmTypeTime ComplexityRemarks
DijkstraDeterministicO(m + n log n)Classical algorithm
DMSY23RandomizedO(m√(log n log log n))First to break Dijkstra
DMSY23 + Folk DerandomizationDeterministicO(m log n √(log log n))Surpassed by Dijkstra
This PaperDeterministicO(m√(log n log log n))Lossless derandomization

Technical Innovation Verification

Corollary 5 demonstrates the method's applicability to different cost functions:

  • For f(x) = x log x, through application of Jensen's inequality
  • Total cost bound: O(m(n/r)log(n/r))
  • Extends to other polynomial-sized cost functions

Single-Source Shortest Paths

  1. Classical Algorithms: Dijkstra's algorithm and its improvements
  2. Integer Weights: Thorup's linear-time algorithm
  3. Recent Breakthroughs: DMSY23's randomized algorithm, DMM+25's deterministic algorithm

Approximate Distance Oracles

  1. Foundational Work: Thorup-Zwick oracle established the foundation
  2. Derandomization Research: RTZ05 proposed improved derandomization methods
  3. Query Optimization: Query time improvements by Wulff-Nilsen et al.

Derandomization Techniques

  1. Traditional Methods: Folk derandomization based on set cover
  2. Problem Limitations: Additional O(log n) factor unacceptable in certain applications
  3. This Paper's Contribution: First true lossless derandomization

Conclusions and Discussion

Main Conclusions

  1. Theoretical Breakthrough: First lossless derandomization of the ball-hitting problem, achieving optimal bounds in expectation
  2. Practical Applications: Successfully applied to two important graph algorithms while maintaining complexity identical to randomized versions
  3. General Framework: Provides a general method applicable to various cost functions

Limitations

  1. Model Assumptions:
    • Requires graph degree to be O(m/n) (achievable through vertex splitting)
    • Requires comparison-addition model
    • For DMSY23 application, assumes constant-degree graphs
  2. Applicable Scope:
    • Primarily applicable to scenarios where ball sizes can be adaptively chosen
    • Inapplicable when ball sizes are completely fixed by input
  3. Practical Efficiency:
    • While asymptotically optimal, hidden constants may be large
    • May underperform simple randomized methods on small-scale graphs

Future Directions

  1. Algorithm Optimization:
    • Reduce constant factors in the algorithm
    • Design more practical implementation versions
  2. Application Extensions:
    • Explore applications in other graph algorithms
    • Study extensions to dynamic graph settings
  3. Theoretical Deepening:
    • Research lower bounds to prove optimality
    • Extend to more general cost functions

In-Depth Evaluation

Strengths

  1. Technical Innovation:
    • First to exploit growable ball properties for lossless derandomization
    • Algorithm design is elegant and clever, with creative exponential decreasing strategy
  2. Theoretical Contributions:
    • Rigorous mathematical proofs with complete theoretical analysis
    • Provides a general framework applicable to various cost functions
  3. Practical Significance:
    • Solves derandomization problems for important algorithms like DMSY23
    • Improvements to Thorup-Zwick oracle have fundamental value
  4. Clear Presentation:
    • Well-structured paper with accurate technical descriptions
    • Concise and understandable algorithm pseudocode

Weaknesses

  1. Lack of Experimental Verification:
    • Pure theoretical work without practical performance testing
    • No empirical comparison with alternative methods
  2. Constant Factor Analysis:
    • While asymptotically optimal, hidden constants may be substantial
    • Practical efficiency in real applications remains unverified
  3. Limited Application Scope:
    • Primarily targets specific problem types
    • Limited guidance for general derandomization problems

Impact

  1. Academic Value:
    • Provides new insights for derandomization techniques
    • May inspire improvements to other algorithms
  2. Practical Value:
    • Provides important tools for applications requiring deterministic guarantees
    • Potential applications in distributed and parallel computing
  3. Reproducibility:
    • Clear algorithm description, easy to implement
    • Theoretical results independently verifiable

Applicable Scenarios

  1. Theoretical Research: Reference for derandomizing other randomized algorithms
  2. System Applications: Replace randomized algorithms in systems requiring deterministic behavior
  3. Educational Use: Excellent case study for derandomization techniques

References

The paper cites extensive related work, primarily including:

  1. DMSY23: Duan et al.'s randomized single-source shortest path algorithm
  2. TZ05: Original work on Thorup-Zwick approximate distance oracle
  3. RTZ05: Roditty et al.'s derandomization improvements
  4. Dij59: Dijkstra's classical shortest path algorithm
  5. FT87: Related work on Fibonacci heaps

This paper makes important contributions to theoretical computer science, particularly in derandomization of graph algorithms. While lacking experimental verification, its theoretical value and potential applications make it a significant advance in the field.