2025-11-25T13:49:17.496621

PHast -- Perfect Hashing made fast

Beling, Sanders

Perfect hash functions give unique "names" to arbitrary keys requiring only a few bits per key. This is an essential building block in applications like static hash tables, databases, or bioinformatics. This paper introduces the PHast approach that combines the fastest available queries, very fast construction, and good space consumption (below 2 bits per key). PHast improves bucket-placement which first hashes each key k to a bucket, and then looks for the bucket seed s such that a placement function maps pairs (s,k) in a collision-free way. PHast can use small-range hash functions with linear mapping, fixed-width encoding of seeds, and parallel construction. This is achieved using small overlapping slices of allowed values and bumping to handle unsuccessful seed assignment. A variant we called PHast+ uses additive placement, which enables bit-parallel seed searching, speeding up the construction by an order of magnitude.

academic

PHast -- Perfect Hashing made fast

Basic Information

Paper ID: 2504.17918
Title: PHast -- Perfect Hashing made fast
Authors: Piotr Beling (University of Łódź, Poland), Peter Sanders (Karlsruhe Institute of Technology, Germany)
Classification: cs.DS cs.DB cs.PF
Publication Date: October 22, 2025 (arXiv version)
Paper Link: https://arxiv.org/abs/2504.17918

Abstract

Perfect hash functions assign unique "names" to arbitrary keys requiring only a few bits per key. This is an essential building block in applications such as static hash tables, databases, and bioinformatics. This paper introduces the PHast approach that combines the fastest available queries, very fast construction, and good space consumption (below 2 bits per key). PHast improves bucket-placement by first hashing each key k to a bucket, then searching for the bucket seed s such that a placement function maps pairs (s,k) in a collision-free manner. PHast can use small-range hash functions with linear mapping, fixed-width encoding of seeds, and parallel construction. This is achieved using small overlapping slices of allowed values and bumping to handle unsuccessful seed assignment. A variant called PHast+ uses additive placement, which enables bit-parallel seed searching, accelerating construction by an order of magnitude.

Research Background and Motivation

Problem Definition

A Perfect Hash Function (PHF) is an injective function that maps a set of keys {k₁, ..., kₙ} to {0, ..., m-1}. When m = n, it is called a Minimal Perfect Hash Function (MPHF). This is an important building block for applications in databases, text indexing, and bioinformatics.

Research Motivation

Multi-objective Optimization Challenge: Designing MPHF involves multi-objective optimization of space consumption, construction time, and query time
Theoretical Lower Bound: The theoretical lower bound for MPHF is log₂ e ≈ 1.44 bits per key
Practical Requirements: In practice, PHF is commonly used to accelerate static data structures, making fast queries critical

Limitations of Existing Methods

Bucket Placement Methods: CHD, FCH, PTHash, PHOBIC, etc. require non-linear bucket allocation functions or variable-length seed encoding, affecting query speed
Recursive Splitting Methods: While space-efficient, query time is slower and requires decoding variable-length navigation information
Fingerprinting Methods: Require at least e bits per key, and queries need rank operations on bit vectors
Parallel Construction Overhead: Existing methods' parallel construction requires additional query steps to retrieve offset values

Core Contributions

Linear Bucket Mapping: Achieves cache-friendly multi-threaded construction through linear mapping to small overlapping slices, avoiding non-linear bucket allocation
Bumping Mechanism: Enables use of small-range hash functions and fixed-width seed encoding, avoiding complex local search
Heuristic Seed Assignment: Reduces space consumption by selecting seeds that occupy the minimum function values
PHast+ Variant: Uses additive placement function to enable bit-parallel seed searching, achieving an order of magnitude speedup in construction
Comprehensive Experimental Evaluation: Detailed performance comparison with existing methods

Method Details

Task Definition

Given a set of n keys, construct a perfect hash function f such that:

f is injective (collision-free)
Query time is as fast as possible
Construction time is reasonable
Space consumption is low (target < 2 bits per key)

Core Algorithm Architecture

Map-or-Bump Function

PHast is based on the "map-or-bump" method, mapping n keys to {0, 1, ..., m-1}:

f(k) = {
  ⌊αh(k)⌋ + p(s, h(k))  if s > 0
  fallback_for_bumped(k)  else
}

Where:

h(k) is a u-bit hash function (u = 64)
s = seeds⌊βh(k)⌋ is the seed
α, β are scaling factors
p(s, h(k)) is the placement function

Key Technical Components

Linear Bucket Allocation:
- Bucket index: ⌊B/2ᵘ · cᵢ⌋
- Slice start value: ⌊(m-L+1)/2ᵘ · cᵢ⌋
- Output value: ⌊(m-L+1)/2ᵘ · cᵢ⌋ + p(s, cᵢ)
Windowed Seed Assignment:
- Uses sliding window of size W = 256
- Priority queue manages unseeded buckets
- Priority function: ℓ(s) - 1024b (s is bucket size, b is bucket index)
Bumping Mechanism:
- Buckets for which no seed can be found are marked as bumped (seed value = 0)
- Approximately 1% of buckets are bumped, with minimal impact on expected query time

PHast+ Optimization

PHast+ uses additive placement function for fast construction:

p(s, c) = c mod L + s - 1        // Formula 3.2
p(s, c) = (c + δs) mod L         // Formula 3.3

Bit-Parallel Seed Searching:

Simultaneously tests feasibility of 64 consecutive seeds
Uses bitwise operations for fast collision detection
Achieves approximately 10x speedup in construction

Parallel Construction

Communication-Free Parallelization:
- Seed array divided into t blocks and t-1 gaps
- Threads compute seeds in blocks in parallel
- Gap size: ⌈LB/(m-L+1)⌉ buckets
Cache-Friendly Design:
- Processes buckets in index order
- Uses circular buffer for occupancy bitmap
- Sequential memory access pattern

Experimental Setup

Experimental Environment

Hardware: AMD Ryzen 5600G @3.9GHz (6 cores, 12 threads)
Cache: 384KB/3MB/16MB L1/L2/L3
Compiler: Rust 1.84.1, GCC 12.2.0
Thread Count: 12 threads (for multi-threaded tests)

Datasets

Random Integer Keys: 5×10⁷ and 5×10⁸ 64-bit random integers
Random String Keys: Random strings of length 10-50 bytes
Hash Function: GxHash (high-performance SIMD hashing)

Comparison Methods

Bucket Placement: PTHash, PHOBIC, PtrHash
Recursive Splitting: SIMDRecSplit, Bipartite ShockHash
Fingerprinting Methods: FiPS, FMPH, FMPHGO
Static Function Retrieval: SicHash

Evaluation Metrics

Space Consumption: bits per key
Query Time: nanoseconds per query
Construction Time: nanoseconds per key
Parallel Speedup: single-thread vs multi-thread performance ratio

Experimental Results

Main Performance Characteristics

Query Performance (5×10⁷ keys)

PHast: 9-22 ns/query, 1.82-2.33 bits/key space
PHast+: 10-15 ns/query, 1.84-2.24 bits/key space
PtrHash: 14-17 ns/query, 2.12-2.99 bits/key space
PTHash: 40-54 ns/query, 2.11-3.19 bits/key space

Construction Performance (5×10⁷ keys, single-threaded)

PHast+: 61-140 ns/key (optimal configuration)
PHast: 133-5322 ns/key (seed size dependent)
PtrHash: 75-192 ns/key
FMPH: 40-57 ns/key (but larger space)

Parallel Speedup

PHast: 8.5-9.6x speedup (efficient seed assignment parallelization)
PHast+: 4.1-5.4x speedup
PtrHash: 3.6-5.1x speedup

Parameter Optimization Results

Optimal Configuration (Space Minimization)

Seed Size S	PHast λ	Space (bits/key)	PHast+ λ	Space (bits/key)
8	4.7	1.91	5.35	2.09
10	6.05	1.85	6.35	1.90
12	7.35	1.81	7.4	1.82

Ablation Studies

Heuristic Seed Selection: Removing it increases space from 1.92 to 2.39 bits/key
Window Size: W=1 increases space to 2.20 bits/key, W>256 shows no significant improvement
Slice Constraints: Removing slice constraints significantly increases space consumption
Bucket Processing Order: Processing by decreasing size (as in CHD) results in larger space consumption

Evolution of Bucket Placement Methods

CHD: Linear bucket allocation but slow construction, requires variable-length seed encoding
FCH/PTHash: Simple non-linear bucket allocation, partially mitigates issues
PHOBIC: Optimal bucket allocation function, but slower queries
PtrHash: Query-optimized PHOBIC variant using local search

Other Method Categories

Fingerprinting Methods: Fast construction but large space, queries require rank operations
Recursive Splitting: Space close to theoretical lower bound but slow queries
Static Function Retrieval: Requires complex multi-position probing

PHast's Uniqueness

PHast avoids variable-length encoding and complex local search through the bumping mechanism while maintaining the simplicity of linear bucket allocation.

Conclusions and Discussion

Main Conclusions

Query Performance: PHast achieves near-theoretically-optimal query time
Construction Efficiency: PHast+ provides extremely fast construction speed
Space Efficiency: Achieves good space consumption while maintaining fast queries
Parallelization-Friendly: Parallel construction without additional query overhead

Limitations

Space vs Recursive Methods: Still not as close to theoretical lower bound as recursive splitting methods
Large Datasets: For 5×10⁸ keys, memory access becomes a bottleneck
Parameter Tuning: Requires selecting appropriate parameter combinations for different application scenarios

Future Directions

External Memory Construction: Implement external memory algorithms outlined in Section 6
Batch Queries: Support batch query optimizations similar to PtrHash
GPU Acceleration: Explore GPU parallelization possibilities
k-Perfect Extension: Support cases where k keys can map to the same value

In-Depth Evaluation

Strengths

Technical Innovation

Simplified Design Philosophy: Avoids complex mechanisms through bumping, embodying "less is more"
Linear Mapping: Restores simplicity of linear bucket allocation while solving its problems
Bit-Parallel Optimization: PHast+'s bit-parallel seed searching is an important engineering innovation
Cache-Friendly Design: Sequential processing and circular buffer design consider modern CPU characteristics

Experimental Rigor

Comprehensive Comparison: Detailed performance comparison covering all major method categories
Multi-Dimensional Evaluation: Trade-off analysis of space, query time, and construction time
Parameter Study: Detailed parameter tuning and ablation experiments
Reproducibility: Open-source implementation and detailed experimental setup

Weaknesses

Method Limitations

Space Overhead: Still approximately 0.4 bits/key gap compared to recursive methods
Parameter Sensitivity: Requires adjusting multiple parameters based on key count and seed size
Theoretical Analysis: Lacks rigorous theoretical proof of space complexity

Experimental Gaps

Limited Datasets: Primarily uses random data, lacks testing on real application data
Memory Hierarchy: Insufficient analysis of effects at different cache levels
Long-term Stability: No testing of performance under large-scale long-term usage

Impact Assessment

Academic Contribution

Theoretical Value: Demonstrates competitiveness of simple methods under engineering optimization
Methodology: Provides "simplification rather than complexification" thinking for data structure design
Benchmark Establishment: Sets new standards for perfect hashing query performance

Practical Value

Direct Application: Applicable to databases, search engines, and other scenarios requiring fast queries
Engineering Reference: Cache-friendly and parallelization designs can be borrowed for other data structures
Open-Source Contribution: Rust implementation provides high-quality tools for the community

Applicable Scenarios

Ideal Applications

Static Hash Tables: Scenarios with fixed key sets and frequent queries
Database Indexing: Database systems requiring fast key-value lookups
Bioinformatics: Applications like k-mer indexing requiring massive queries
Caching Systems: In-memory caches requiring extremely fast query response

Constraints

Sufficient Memory: Requires adequate memory to store complete data structure
Static Data: Not suitable for scenarios with frequent updates
Query-Intensive: Best suited for applications where query frequency far exceeds construction frequency

References

PHOBIC: Hermann et al. "Perfect hashing with optimized bucket sizes and interleaved coding"
PtrHash: Groot Koerkamp "PtrHash: Minimal Perfect Hashing at RAM Throughput"
PTHash: Pibiri & Trani "PTHash: Revisiting FCH minimal perfect hashing"
SIMDRecSplit: Bez et al. "High performance construction of RecSplit based minimal perfect hash functions"

Theoretical Foundations

Fredman & Komlós: Theoretical lower bounds for perfect hash functions
Belazzougui et al.: Foundational work on bucket placement methods

The PHast paper demonstrates that in algorithmic engineering, simple methods carefully optimized through deep understanding of problem essence and modern hardware characteristics can achieve or even surpass the performance of complex methods. This provides important insights for data structure design: sometimes the key to solving problems is not adding complexity, but finding the correct direction for simplification.