2025-11-12T12:22:13.745054

3SUM in Preprocessed Universes: Faster and Simpler

Kasliwal, Polak, Sharma
We revisit the 3SUM problem in the \emph{preprocessed universes} setting. We present an algorithm that, given three sets $A$, $B$, $C$ of $n$ integers, preprocesses them in quadratic time, so that given any subsets $A' \subseteq A$, $B' \subseteq B$, $C' \subseteq C$, it can decide if there exist $a \in A'$, $b \in B'$, $c \in C'$ with $a+b=c$ in time $O(n^{1.5} \log n)$. In contrast to both the first subquadratic $\tilde{O}(n^{13/7})$-time algorithm by Chan and Lewenstein (STOC 2015) and the current fastest $\tilde{O}(n^{11/6})$-time algorithm by Chan, Vassilevska Williams, and Xu (STOC 2023), which are based on the Balog--Szemerédi--Gowers theorem from additive combinatorics, our algorithm uses only standard 3SUM-related techniques, namely FFT and linear hashing modulo a prime. It is therefore not only faster but also simpler. Just as the two previous algorithms, ours not only decides if there is a single 3SUM solution but it actually determines for each $c \in C'$ if there is a solution containing it. We also modify the algorithm to still work in the scenario where the set $C$ is unknown at the time of preprocessing. Finally, even though the simplest version of our algorithm is randomized, we show how to make it deterministic losing only polylogarithmic factors in the running time.
academic

3SUM in Preprocessed Universes: Faster and Simpler

Basic Information

  • Paper ID: 2410.16784
  • Title: 3SUM in Preprocessed Universes: Faster and Simpler
  • Authors: Shashwat Kasliwal (IIT Delhi), Adam Polak (Bocconi University), Pratyush Sharma (IIT Delhi)
  • Classification: cs.DS (Data Structures and Algorithms)
  • Publication Time/Venue: TheoretiCS Volume 4 (2025), Article 24 (Invited article from SOSA 2025)
  • Paper Link: https://arxiv.org/abs/2410.16784

Abstract

This paper revisits the 3SUM problem in the preprocessed universe setting. It proposes an algorithm that, given three sets A, B, C each containing n integers, preprocesses them in quadratic time such that for any subsets A' ⊆ A, B' ⊆ B, C' ⊆ C, one can determine in O(n^1.5 log n) time whether there exist a ∈ A', b ∈ B', c ∈ C' satisfying a+b=c. Compared to Chan and Lewenstein's first subquadratic algorithm O(n^13/7) and Chan et al.'s current fastest O(n^11/6) algorithm (based on the Balog-Szemerédi-Gowers theorem from additive combinatorics), this algorithm employs only standard 3SUM techniques (FFT and linear hashing modulo primes), making it not only faster but also simpler.

Research Background and Motivation

Problem Background

The 3SUM problem is one of three core problems in fine-grained complexity theory (alongside APSP and SAT). Given three sets A, B, C each containing n integers, one must determine whether there exists a triple (a,b,c) ∈ A × B × C such that a + b = c. The standard textbook approach has time complexity O(n²), and the fastest known algorithm provides only log²n/(log log n)² improvement.

Research Motivation

  1. Complexity-theoretic significance: It is widely conjectured that no O(n^(2-ε)) algorithm exists for 3SUM, and many conditional lower bounds are based on this assumption
  2. Value of variant study: Investigating 3SUM variants that do admit strong subquadratic algorithms helps better understand 3SUM's complexity
  3. Practical considerations: The preprocessed universe variant has important practical value, enabling efficient processing of multiple queries

Limitations of Existing Methods

  • Chan-Lewenstein algorithm: Based on the complex Balog-Szemerédi-Gowers theorem, difficult to implement
  • Chan-Vassilevska Williams-Xu algorithm: Though faster, still relies on deep additive combinatorics theory
  • Both lack simplicity and have high practical implementation complexity

Core Contributions

  1. Improved algorithm efficiency: Proposes an algorithm with query time O(n^1.5 log n), significantly better than the previous O(n^11/6)
  2. Technical simplification: Uses only standard techniques (FFT and linear hashing), avoiding complex additive combinatorics tools
  3. Complete functionality: Not only determines existence of solutions but also identifies which c ∈ C' participates in some solution
  4. Scenario extension: Handles cases where set C is unknown at preprocessing time
  5. Deterministic version: Provides deterministic algorithm with only polylogarithmic loss

Detailed Methodology

Task Definition

Input: Three sets A, B, C each containing n integers Preprocessing: Preprocess these sets in O(n²) time Query: Given subsets A' ⊆ A, B' ⊆ B, C' ⊆ C, determine in O(n^1.5 log n) time for each c ∈ C' whether there exist a ∈ A', b ∈ B' such that a + b = c

Core Algorithm Architecture

Randomized Algorithm for Known Set C (Theorem 3.1)

Preprocessing Phase:

  1. Uniformly randomly select a prime p from the interval [n^1.5, 2n^1.5)
  2. Compute the false positive set: B := {(a,b,c) ∈ A × B × C : a + b ≡ c (mod p) ∧ a + b ≠ c}
  3. Expected number of false positives: E|B| ≤ O(n^1.5 log n)

Query Phase:

  1. Use FFT to compute the multiset (A' + B') mod p in O(p log p) time
  2. Create hash table H storing the count of modular p solutions for each c ∈ C'
  3. Traverse the false positive set B, subtracting false positives in the current instance
  4. For each c ∈ C', answer "yes" if Hc > 0, otherwise "no"

Algorithm for Unknown Set C (Theorem 4.1)

Preprocessing Phase:

  1. Compute the sumset A + B
  2. For each x ∈ A + B, store the witness set Wx := {(a,b) ∈ A × B : a + b = x}
  3. Select random prime m ∈ [n^1.5, 2·n^1.5)
  4. For each residue r ∈ m, prepare list Lr := {x ∈ A + B : x ≡ r (mod m)}

Query Phase:

  1. Use FFT to compute (A' + B') mod m
  2. For each c ∈ C':
    • Check whether c is in A + B
    • Use identity to compute the count of true solutions: modular m solution count minus false positive count
    • Compute false positives by traversing elements x ≠ c in Lc mod m

Technical Innovations

  1. Clever use of hashing: Selecting appropriately-sized prime moduli balances FFT efficiency and false positive count
  2. False positive control: Leverages Lemma 2.2 to bound expected false positive count at O(n^1.5 log n)
  3. FFT optimization: Transforms 3SUM into polynomial multiplication, exploiting FFT's O(m log m) complexity
  4. Deterministic conversion: Uses multi-modulus combination strategy for deterministic version

Experimental Setup

Theoretical Analysis Framework

This paper primarily conducts theoretical analysis using standard fine-grained complexity analysis methods:

Computational Model:

  • Standard word RAM with O(log n)-bit words
  • Input numbers bounded by n^O(1)
  • Arithmetic operations in constant time

Complexity Analysis:

  • Time complexity: O(n²) preprocessing, O(n^1.5 log n) query
  • Space complexity: O(n^1.5 log n) for known C version, O(n²) for unknown C version
  • Randomness: Las Vegas algorithm (expected time bounds)

Comparison Baselines

  1. Chan-Lewenstein (STOC 2015): O(n^13/7) query time
  2. Chan-Vassilevska Williams-Xu (STOC 2023): O(n^11/6) query time
  3. Standard 3SUM algorithm: O(n²) time (without preprocessing)

Experimental Results

Complexity Comparison Analysis

AlgorithmPreprocessing TimeQuery TimeSpace ComplexityDeterministic
Chan-LewensteinO(n²)O(n^13/7) ≈ O(n^1.857)O(n^13/7)Requires O(n^ω) preprocessing
Chan et al.O(n²)O(n^11/6) ≈ O(n^1.833)O(n² log n)Query time O(n^1.891)
This work (Known C)O(n²)O(n^1.5 log n)O(n^1.5 log n)Polylogarithmic loss
This work (Unknown C)O(n²)O(n^1.5 log n)O(n²)Theorem 5.1

Quantified Performance Improvements

  • Query time improvement: From O(n^11/6) ≈ O(n^1.833) to O(n^1.5), exponential reduction of approximately 0.333
  • Implementation complexity: Avoids Balog-Szemerédi-Gowers theorem, requires only FFT and hashing
  • Functional completeness: Maintains All-Numbers 3SUM capability

Algorithm Correctness

Proven through rigorous probabilistic analysis:

  • False positive expected bound: E|B| ≤ O(n^1.5 log n) (Lemma 2.2)
  • Las Vegas property: Restart mechanism ensures deterministic runtime bounds
  • Completeness: All true solutions are correctly identified

3SUM Research Timeline

  1. Classical 3SUM: Introduced by Gajentaan-Overmars, O(n²) standard algorithm
  2. Minor improvements: Baran-Demaine-Pătraşcu achieve polylog factor improvements
  3. Variant studies:
    • Small universe 3SUM: FFT method, O(n + U log U) time
    • 3SUM indexing: Different preprocessing paradigms
    • Real RAM version: Adaptation work by Fischer et al.

Preprocessed Universe Research

  • Bansal-Williams: First introduction of preprocessed universe concept
  • Chan-Lewenstein: First subquadratic algorithm, based on BSG theorem
  • Chan et al.: Current fastest algorithm, direct proof of BSG corollary

Technical Tools Development

  • FFT application in 3SUM: Classical method for small universe case
  • Linear hashing: Standard technique for false positive control
  • Derandomization techniques: Fischer-Kaliciak-Polak derandomization tools

Conclusions and Discussion

Main Conclusions

  1. Efficiency breakthrough: Achieves O(n^1.5 log n) query time, significantly better than previous best result
  2. Successful simplification: Avoids complex additive combinatorics, uses only basic tools
  3. Enhanced practicality: Provides deterministic version and handles unknown C scenario

Limitation Analysis

  1. Space complexity: Unknown C version requires O(n²) space to store complete sumset
  2. Model restrictions: Assumes bounded input numbers; practical applications may require additional handling
  3. Real RAM: Unclear whether adaptation to real RAM model is feasible
  4. Constant factors: Theoretical analysis does not account for practical implementation constants

Future Research Directions

  1. Real RAM adaptation: Explore feasibility in real RAM model
  2. Space optimization: Reduce space requirements for unknown C scenario
  3. Lower bound research: Investigate theoretical lower bounds for preprocessed universe 3SUM
  4. Practical implementation: Develop efficient practical algorithm implementations

In-Depth Evaluation

Strengths

  1. Significant theoretical contribution: Query time improved from O(n^1.833) to O(n^1.5), substantial improvement
  2. Clever technical innovations:
    • Prime selection strategy balances FFT efficiency and false positive control
    • Multi-modulus derandomization method has broad applicability
  3. High practical value: Algorithm is simple and easy to implement, avoids complex combinatorics tools
  4. Rigorous analysis: Probabilistic analysis and complexity proofs are complete and reliable
  5. Clear presentation: Technical descriptions are accurate and algorithm descriptions are easy to understand

Weaknesses

  1. Limited novelty: Primarily clever combination of existing techniques, relatively limited originality
  2. Missing experimental validation: Pure theoretical work lacking practical performance testing
  3. Limited applicability:
    • Bounded input number assumption may be too restrictive in practice
    • Real RAM adaptability unknown
  4. Space efficiency: O(n²) space requirement for unknown C version may limit practical applications

Impact Assessment

  1. Academic value: Provides new technical pathway for fine-grained complexity theory
  2. Practical potential: Simplified algorithm easier to deploy in practical systems
  3. Inspirational significance: Demonstrates that standard technique combinations can surpass complex theoretical tools
  4. Follow-up research: May inspire similar improvements for other geometric/combinatorial problems

Applicable Scenarios

  1. Database queries: Optimization of triplet queries on large datasets
  2. Computational geometry: Preprocessing acceleration for related geometric problems
  3. Cryptographic applications: Optimization of protocols based on 3SUM hardness
  4. Algorithm competitions: Efficient implementation in practical programming contests

References

The paper cites 16 related works, primarily including:

  • 3 Baran, Demaine, Pătraşcu: Classical 3SUM improvements
  • 5 Chan, Lewenstein: First subquadratic preprocessed universe algorithm
  • 6 Chan, Vassilevska Williams, Xu: Current fastest algorithm
  • 8 Fischer, Kaliciak, Polak: Deterministic 3SUM techniques
  • 16 Vassilevska Williams: Fine-grained complexity survey

Overall Assessment: This is a high-quality theoretical computer science paper achieving significant theoretical breakthrough on the preprocessed universe variant of the 3SUM problem. While technical innovations are relatively incremental, its contribution of simplifying complex algorithms while improving performance has important value. The paper's theoretical analysis is rigorous, presentation is clear, and it provides valuable new tools and insights for related fields.