2025-11-20T02:31:13.678138

Lossless Derandomization for Undirected Single-Source Shortest Paths and Approximate Distance Oracles

Yan

A common step in algorithms related to shortest paths in undirected graphs is that, we select a subset of vertices as centers, then grow a ball around each vertex until a center is reached. We want the balls to be as small as possible. A randomized algorithm can uniformly sample $r$ centers to achieve the optimal (expected) ball size of $Î(n/r)$. A folklore derandomization is to use the $O(\log n)$ approximation for the set cover problem in the hitting set version where we want to hit all the balls with the centers. However, the extra $O(\log n)$ factor is sometimes too expensive. For example, the recent $O(m\sqrt{\log n\log\log n})$ undirected single-source shortest path algorithm [DMSY23] beats Dijkstra's algorithm in sparse graphs, but the folklore derandomization would make it dominated by Dijkstra's. In this paper, we exploit the fact that the sizes of these balls can be adaptively chosen by the algorithm instead of fixed by the input. We propose a simple deterministic algorithm achieving the optimal ball size of $Î(n/r)$ on average. Furthermore, given any polynomially large cost function of the ball size, we can still achieve the optimal cost on average. It allows us to derandomize [DMSY23], resulting in a deterministic $O(m\sqrt{\log n\log\log n})$ algorithm for undirected single-source shortest path. In addition, we show that the same technique can also be used to derandomize the seminal Thorup-Zwick approximate distance oracle [TZ05], also without any loss in the time/space complexity.

academic

Lossless Derandomization for Undirected Single-Source Shortest Paths and Approximate Distance Oracles

Basic Information

Paper ID: 2510.12598
Title: Lossless Derandomization for Undirected Single-Source Shortest Paths and Approximate Distance Oracles
Author: Shuyi Yan (BARC, University of Copenhagen)
Classification: cs.DS (Data Structures and Algorithms)
Publication Date: October 14, 2025 (arXiv preprint)
Paper Link: https://arxiv.org/abs/2510.12598

Abstract

This paper investigates a fundamental problem in shortest path algorithms on undirected graphs: how to select a subset of vertices as centers and grow balls from each vertex until reaching a center, with the goal of minimizing the total ball size. While randomized algorithms can achieve optimal expected ball size Θ(n/r) by uniformly sampling r centers, traditional derandomization methods introduce an additional O(log n) factor, which is prohibitively expensive in certain applications. By exploiting the fact that ball sizes can be adaptively chosen by the algorithm, this paper proposes a simple deterministic algorithm that achieves optimal ball size Θ(n/r) in expectation and extends to arbitrary polynomial-sized cost functions. The technique successfully derandomizes the O(m√(log n log log n)) single-source shortest path algorithm of DMSY23 and the classical Thorup-Zwick approximate distance oracle without loss in time/space complexity.

Research Background and Motivation

Core Problem

The core problem addressed in this paper is the Hitting Growable Balls problem. Many graph algorithms, particularly those for shortest paths, distance oracles, and sparse subgraph construction, encounter the following pattern:

Select a subset R of vertices as centers
Grow a ball B(v) from each vertex v until reaching some center
Algorithm performance depends on |R| and the total size of all balls

Problem Significance

This problem holds fundamental importance in graph algorithms, affecting the efficiency of multiple critical algorithms:

Single-Source Shortest Paths: The DMSY23 algorithm first breaks the O(m + n log n) barrier of Dijkstra's algorithm on sparse graphs
Distance Oracles: The Thorup-Zwick oracle is a classical data structure for approximate distance queries
Graph Sparsification: Similar techniques are employed in constructing sparse subgraphs

Limitations of Existing Methods

Traditional derandomization approaches have critical shortcomings:

Cost of Folk Methods: Using the O(log n)-approximation algorithm for set cover to derandomize introduces an additional O(log n) factor
Severe Performance Loss: For the DMSY23 algorithm, this additional factor degrades the time complexity from O(m√(log n log log n)) to O(m log n √(log log n)), being surpassed again by Dijkstra's algorithm
Fixed Ball Size Assumption: Traditional methods assume ball sizes are fixed by the input, failing to exploit algorithmic adaptivity

Research Motivation

The key insight of this paper is: ball sizes can be adaptively chosen by the algorithm rather than being fixed by the input. This provides new possibilities for designing superior derandomization algorithms.

Core Contributions

Proposes a deterministic algorithm for hitting growable balls: Designs Algorithm 1 that achieves optimal ball size Θ(n/r) in expectation without introducing additional logarithmic factors
Achieves lossless derandomization of DMSY23: Transforms the randomized O(m√(log n log log n)) single-source shortest path algorithm into a deterministic algorithm with identical complexity
Derandomizes the Thorup-Zwick distance oracle: Removes the O(log n) factor from previous derandomization methods, achieving O(kn^(1/k)(m + n log n)) preprocessing time identical to the randomized version
Provides a general theoretical framework: The method extends to arbitrary polynomial-sized cost functions with broad applicability

Methodology Details

Problem Formulation

The Hitting Growable Balls problem is formally defined as follows:

Input: n vertices, m initially empty balls, parameter r ∈ 1,n, cost function f(·)
Operations: The algorithm can select a ball and request the adversary to add a new vertex to it
Objective: Select r vertices as centers such that every ball is hit (contains at least one center), minimizing total cost ∑f(|Bi|)

Core Algorithm Architecture

Algorithm 1 is the central contribution of this paper:

Algorithm 1: Hitting Growable Balls
Input: number of vertices n, number of balls m, target number of centers r, cost parameter p
Output: at most r centers hitting all balls

1: Initialize all balls as empty, no centers selected
2: b ← ⌈2^(p+2)n/r⌉
3: Grow each ball to size min{b, n}
4: while not all balls are hit do
5:   m' ← number of unhit balls
6:   Repeatedly select new centers hitting the most unhit balls,
     until number of unhit balls ≤ m'/2^(p+1)
7:   b ← 2b
8:   Grow each unhit ball to size min{b, n}
9: return all selected centers

Algorithm Core Ideas

Exponential Decreasing Strategy: In round i, use only O(r/2^i) centers, but allow ball sizes to grow to 2^i n/r
Balanced Trade-off: Although later rounds have larger balls, the number of unhit balls decreases exponentially
Adaptive Growth: Dynamically adjust ball sizes based on current unhit ball status

Theoretical Analysis

Theorem 4 proves the algorithm's correctness:

Number of Centers: Selects at most r centers
Total Cost: O_p(m(n/r)^p) = O_p(m·f(n/r))
Time Complexity: O_p(r + mn/r)

Key Proof Points:

Analysis of center count bounds in each round
Exponential decrease of unhit ball counts
Obtaining total cost bounds through geometric series summation

Experimental Setup

Theoretical Verification

This is primarily a theoretical work, with correctness and complexity bounds verified through rigorous mathematical proofs.

Application Verification

The method's effectiveness is verified through two concrete applications:

Single-Source Shortest Paths:
- Integrates Algorithm 1 into the bundle construction phase of DMSY23
- Cost function set to f(x) = x log x
- Parameter choice r = m√(log log n)/√(log n)
Thorup-Zwick Distance Oracle:
- Applies Algorithm 1 at each level i to select A_{i+1}
- Combines with RTZ05 techniques for efficient ball growth operations
- Parameter setting r = n^(-1/k)|A_i|

Implementation Details

Data Structure Optimization:

Maintains count a_j of unhit balls that vertex j can hit
Uses doubly-linked lists L_k to maintain vertices with a_j = k
Supports O(1) insertion, deletion, and maximum finding operations

Experimental Results

Main Results

Theorem 2 (Single-Source Shortest Paths):

In undirected graphs with non-negative edge weights, there exists a deterministic comparison-addition algorithm
Time Complexity: O(m√(log n log log n))
Identical to the randomized DMSY23 algorithm complexity

Theorem 3 (Thorup-Zwick Oracle):

Thorup-Zwick approximate distance oracle of size O(kn^{1+1/k})
Can be deterministically constructed in O(kn^{1/k}(m + n log n)) time
Identical preprocessing time to the original randomized algorithm

Complexity Comparison

Algorithm	Type	Time Complexity	Remarks
Dijkstra	Deterministic	O(m + n log n)	Classical algorithm
DMSY23	Randomized	O(m√(log n log log n))	First to break Dijkstra
DMSY23 + Folk Derandomization	Deterministic	O(m log n √(log log n))	Surpassed by Dijkstra
This Paper	Deterministic	O(m√(log n log log n))	Lossless derandomization

Technical Innovation Verification

Corollary 5 demonstrates the method's applicability to different cost functions:

For f(x) = x log x, through application of Jensen's inequality
Total cost bound: O(m(n/r)log(n/r))
Extends to other polynomial-sized cost functions

Single-Source Shortest Paths

Classical Algorithms: Dijkstra's algorithm and its improvements
Integer Weights: Thorup's linear-time algorithm
Recent Breakthroughs: DMSY23's randomized algorithm, DMM+25's deterministic algorithm

Approximate Distance Oracles

Foundational Work: Thorup-Zwick oracle established the foundation
Derandomization Research: RTZ05 proposed improved derandomization methods
Query Optimization: Query time improvements by Wulff-Nilsen et al.

Derandomization Techniques

Traditional Methods: Folk derandomization based on set cover
Problem Limitations: Additional O(log n) factor unacceptable in certain applications
This Paper's Contribution: First true lossless derandomization

Conclusions and Discussion

Main Conclusions

Theoretical Breakthrough: First lossless derandomization of the ball-hitting problem, achieving optimal bounds in expectation
Practical Applications: Successfully applied to two important graph algorithms while maintaining complexity identical to randomized versions
General Framework: Provides a general method applicable to various cost functions

Limitations

Model Assumptions:
- Requires graph degree to be O(m/n) (achievable through vertex splitting)
- Requires comparison-addition model
- For DMSY23 application, assumes constant-degree graphs
Applicable Scope:
- Primarily applicable to scenarios where ball sizes can be adaptively chosen
- Inapplicable when ball sizes are completely fixed by input
Practical Efficiency:
- While asymptotically optimal, hidden constants may be large
- May underperform simple randomized methods on small-scale graphs

Future Directions

Algorithm Optimization:
- Reduce constant factors in the algorithm
- Design more practical implementation versions
Application Extensions:
- Explore applications in other graph algorithms
- Study extensions to dynamic graph settings
Theoretical Deepening:
- Research lower bounds to prove optimality
- Extend to more general cost functions

In-Depth Evaluation

Strengths

Technical Innovation:
- First to exploit growable ball properties for lossless derandomization
- Algorithm design is elegant and clever, with creative exponential decreasing strategy
Theoretical Contributions:
- Rigorous mathematical proofs with complete theoretical analysis
- Provides a general framework applicable to various cost functions
Practical Significance:
- Solves derandomization problems for important algorithms like DMSY23
- Improvements to Thorup-Zwick oracle have fundamental value
Clear Presentation:
- Well-structured paper with accurate technical descriptions
- Concise and understandable algorithm pseudocode

Weaknesses

Lack of Experimental Verification:
- Pure theoretical work without practical performance testing
- No empirical comparison with alternative methods
Constant Factor Analysis:
- While asymptotically optimal, hidden constants may be substantial
- Practical efficiency in real applications remains unverified
Limited Application Scope:
- Primarily targets specific problem types
- Limited guidance for general derandomization problems

Impact

Academic Value:
- Provides new insights for derandomization techniques
- May inspire improvements to other algorithms
Practical Value:
- Provides important tools for applications requiring deterministic guarantees
- Potential applications in distributed and parallel computing
Reproducibility:
- Clear algorithm description, easy to implement
- Theoretical results independently verifiable

Applicable Scenarios

Theoretical Research: Reference for derandomizing other randomized algorithms
System Applications: Replace randomized algorithms in systems requiring deterministic behavior
Educational Use: Excellent case study for derandomization techniques

References

The paper cites extensive related work, primarily including:

DMSY23: Duan et al.'s randomized single-source shortest path algorithm
TZ05: Original work on Thorup-Zwick approximate distance oracle
RTZ05: Roditty et al.'s derandomization improvements
Dij59: Dijkstra's classical shortest path algorithm
FT87: Related work on Fibonacci heaps

This paper makes important contributions to theoretical computer science, particularly in derandomization of graph algorithms. While lacking experimental verification, its theoretical value and potential applications make it a significant advance in the field.