2025-11-11T16:25:09.674123

Multi-Way Co-Ranking: Index-Space Partitioning of Sorted Sequences Without Merge

Joshi
We present a merge-free algorithm for multi-way co-ranking, the problem of computing cut indices $i_1,\dots,i_m$ that partition each of the $m$ sorted sequences such that all prefix segments together contain exactly $K$ elements. Our method extends two-list co-ranking to arbitrary $m$, maintaining per-sequence bounds that converge to a consistent global frontier without performing any multi-way merge or value-space search. Rather, we apply binary search to \emph{index-space}. The algorithm runs in $O(\log(\sum_t n_t)\,\log m)$ time and $O(m)$ space, independent of $K$. We prove correctness via an exchange argument and discuss applications to distributed fractional knapsack, parallel merge partitioning, and multi-stream joins. Keywords: Co-ranking \sep partitioning \sep Merge-free algorithms \sep Index-space optimization \sep Selection and merging \sep Data structures
academic

Multi-Way Co-Ranking: Index-Space Partitioning of Sorted Sequences Without Merge

Basic Information

  • Paper ID: 2510.22882
  • Title: Multi-Way Co-Ranking: Index-Space Partitioning of Sorted Sequences Without Merge
  • Author: Amit Joshi (Independent Researcher)
  • Classification: cs.DS (Data Structures and Algorithms)
  • Publication Date: October 27, 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2510.22882

Abstract

This paper proposes a merge-free multi-way co-ranking algorithm for computing cutting indices i1,,imi_1,\dots,i_m that partition mm sorted sequences such that all prefix segments collectively contain exactly KK elements. The method extends Siebert and Träff's two-list co-ranking to arbitrary mm-way partitioning, maintaining boundaries for each sequence and converging to a consistent global frontier without performing multi-way merge or value-space search. The algorithm applies binary search in index space with time complexity O(log(tnt)logm)O(\log(\sum_t n_t)\log m) and space complexity O(m)O(m), independent of KK. Correctness is established through exchange arguments, and applications in distributed fractional knapsack, parallel merge partitioning, and multi-stream joins are discussed.

Research Background and Motivation

Problem Definition

The multi-way co-ranking problem is defined as follows: Given mm sequences L1,,LmL_1, \ldots, L_m sorted in non-decreasing order (duplicates allowed), each of length ntn_t, and a global target rank K{0,,N}K \in \{0, \ldots, N\} (where N=tntN = \sum_t n_t), find cutting indices i1,,imi_1, \ldots, i_m such that:

t=1mit=Kandmaxttmintrt\sum_{t=1}^m i_t = K \quad \text{and} \quad \max_t \ell_t \leq \min_t r_t

where t\ell_t and rtr_t denote the left and right boundary values, respectively.

Research Motivation

  1. Extending Classical Algorithms: Existing co-ranking algorithms primarily target two sequences, lacking efficient multi-way extensions
  2. Avoiding Merge Overhead: Traditional methods require merging multiple sequences before selection, incurring substantial overhead
  3. Index-Space Advantages: Operating in index space rather than value space avoids the complexity of value-domain search
  4. Practical Application Demands: Distributed computing, parallel processing, and database queries all require efficient multi-way partitioning algorithms

Limitations of Existing Methods

  • Siebert-Träff Method: Supports only two-sequence co-ranking
  • Frederickson-Johnson Method: Operates in value space, requiring global counting operations
  • Splitter-Based Methods: Require pre-merging or value-domain search, with higher complexity

Core Contributions

  1. Algorithm Design: Proposes the first merge-free multi-way co-ranking algorithm, extending the classical two-way method to arbitrary mm-way partitioning
  2. Theoretical Analysis: Proves algorithm correctness and O(log(tnt)logm)O(\log(\sum_t n_t)\log m) time complexity
  3. Data Structure Innovation: Designs indexed heaps (addressable heaps) to efficiently maintain boundary values
  4. Application Extension: Demonstrates algorithm applicability in distributed optimization, parallel processing, and database systems

Method Details

Task Definition

Input:

  • mm sorted sequences L1,,LmL_1, \ldots, L_m with lengths n1,,nmn_1, \ldots, n_m respectively
  • Target rank K[0,N]K \in [0, N], where N=t=1mntN = \sum_{t=1}^m n_t

Output:

  • Cutting index vector (i1,,im)(i_1, \ldots, i_m) satisfying co-ranking conditions

Constraints:

  • t=1mit=K\sum_{t=1}^m i_t = K
  • maxttmintrt\max_t \ell_t \leq \min_t r_t (co-ranking condition)

Algorithm Architecture

Core Data Structure: Index Heaps

The algorithm maintains two index heaps:

  • HLH_L: Max-heap storing left boundary values (t,t)(\ell_t, t), returning the sequence with maximum left boundary (donor)
  • HRH_R: Min-heap storing right boundary values (rt,t)(r_t, t), returning the sequence with minimum right boundary (receiver)

Each heap supports O(logm)O(\log m) update_key operations and O(1)O(1) peek operations.

Boundary Management

For each sequence tt, maintain:

  • Lower Bound: Lb[t]i[t]Lb[t] \leq i[t]
  • Upper Bound: i[t]Ub[t]i[t] \leq Ub[t]
  • Current Index: i[t]i[t]

Iterative Strategy

The algorithm employs a donor-receiver greedy strategy:

  1. Identify Extrema:
    • Donor p=argmaxttp = \arg\max_t \ell_t (maximum left boundary)
    • Receiver q=argmintrtq = \arg\min_t r_t (minimum right boundary)
  2. Compute Transfer Amount:
    donor_slack = ⌈(i[p] - Lb[p])/2⌉
    receiver_slack = ⌈(Ub[q] - i[q])/2⌉
    Δ = min{donor_slack, receiver_slack}
    
  3. Execute Transfer:
    • i[p]i[p]Δi[p] \leftarrow i[p] - \Delta
    • i[q]i[q]+Δi[q] \leftarrow i[q] + \Delta
    • Update bounds: Ub[p]i[p]Ub[p] \leftarrow i[p], Lb[q]i[q]Lb[q] \leftarrow i[q]
  4. Update Heaps: Refresh heap keys for affected sequences

Technical Innovations

  1. Index-Space Operations: Operates entirely in index space, avoiding value-domain search and merge operations
  2. Geometric Convergence: Guarantees logarithmic convergence speed through halving the feasible region
  3. Imbalance Potential Function: Defines Φ(i)=maxttmintrt\Phi(i) = \max_t \ell_t - \min_t r_t as convergence criterion
  4. Deterministic Complexity: Algorithm complexity is independent of target rank KK

Theoretical Analysis

Correctness Proof

Lemma 1 (Local Extrema Optimality)

If Φ(i)>0\Phi(i) > 0, let p=argmaxttp = \arg\max_t \ell_t and q=argmintrtq = \arg\min_t r_t. Among all feasible infinitesimal transfers maintaining tit=K\sum_t i_t = K, the (p,q)(p,q) pair achieves the maximum non-increasing change in Φ\Phi.

Proof Sketch: Decreasing ipi_p reduces p\ell_p (the local maximum of left boundaries), while increasing iqi_q increases rqr_q (the local minimum of right boundaries). Since pu\ell_p \geq \ell_u and rqrvr_q \leq r_v for all u,vu,v, the extremal pair (p,q)(p,q) produces the steepest reduction in the gap maxminr\max\ell - \min r.

Lemma 2 (Transfer Order Exchangeability)

Any sequence of feasible transfers reducing Φ\Phi can be reordered such that all extremal (p,q)(p,q) transfers occur before any non-extremal transfers, without worsening Φ\Phi at any intermediate step.

Theorem 1 (Convergence and Validity)

Algorithm 2 terminates with a valid co-ranking vector (i1,,im)(i_1, \ldots, i_m) satisfying tit=K\sum_t i_t = K and maxttmintrt\max_t \ell_t \leq \min_t r_t.

Complexity Analysis

Round Analysis

In each round, the feasible distance of either the donor or receiver is halved. The distance Ub[t]Lb[t]Ub[t] - Lb[t] for each sequence is reduced at most O(lognt)O(\log n_t) times. Aggregating across all mm sequences, the total number of rounds is:

T=O(log(t=1mnt))T = O\left(\log\left(\sum_{t=1}^m n_t\right)\right)

Time Complexity

Each round executes a constant number of index heap operations (O(logm)O(\log m) time), yielding total time complexity:

O(log(tnt)logm)O\left(\log\left(\sum_t n_t\right) \cdot \log m\right)

Space Complexity

The algorithm only needs to store indices and boundary information for mm sequences, with space complexity O(m)O(m).

Algorithm Implementation

Core Algorithm Flow

def multi_way_corank(sequences, K):
    m = len(sequences)
    # Initialize bounds and indices
    Lb = [0] * m
    Ub = [len(seq) for seq in sequences]
    i = water_fill_initialization(K, Ub)
    
    # Construct index heaps
    HL = MaxHeap()  # Max-heap for left boundaries
    HR = MinHeap()  # Min-heap for right boundaries
    
    for t in range(m):
        HL.insert(t, left_boundary(sequences[t], i[t]))
        HR.insert(t, right_boundary(sequences[t], i[t]))
    
    while True:
        # Obtain donor and receiver
        max_left, p = HL.peek()
        min_right, q = HR.peek()
        
        # Check termination condition
        if max_left <= min_right:
            break
            
        # Compute transfer amount
        donor_slack = ceil((i[p] - Lb[p]) / 2)
        receiver_slack = ceil((Ub[q] - i[q]) / 2)
        delta = min(donor_slack, receiver_slack)
        
        # Execute transfer
        i[p] -= delta
        i[q] += delta
        
        # Update bounds
        Ub[p] = i[p]
        Lb[q] = i[q]
        
        # Update heaps
        update_heaps(HL, HR, sequences, i, p, q)
    
    return i

Initialization Strategy

Employs a "water-filling" strategy for feasible solution initialization:

def water_fill_initialization(K, capacities):
    i = [0] * len(capacities)
    need = K
    for t in range(len(capacities)):
        take = min(capacities[t], need)
        i[t] = take
        need -= take
        if need == 0:
            break
    return i

Application Scenarios

1. Distributed Fractional Knapsack Problem

In multi-source fractional knapsack problems, when items are sorted by density and distributed across mm shards, co-ranking computes global KK-prefix partitioning without merging source data.

2. Parallel mm-Way Merge Partitioning

Assigns disjoint prefixes to processors without executing preliminary merge. The co-ranking vector determines exact handoff points, allowing processors to perform only local merging over their ranges.

3. Multi-Stream Join Partitioning

In database or stream processing pipelines, partitioning join frontiers at global rank is a natural requirement. This method produces per-stream cursors consistent with the global prefix.

Experimental Verification

While the paper primarily focuses on theoretical analysis, the author provides implementation code for verification. Algorithm practical performance can be evaluated through:

Theoretical Performance Guarantees

  • Time Complexity: O(log(tnt)logm)O(\log(\sum_t n_t) \log m)
  • Space Complexity: O(m)O(m)
  • Independence: Complexity is independent of target rank KK

Comparison with Existing Methods

  • vs. Merge Methods: Avoids O(N)O(N) merge overhead
  • vs. Value-Space Methods: Avoids global counting operations
  • vs. Frederickson-Johnson: Index-space operations provide greater efficiency

Two-List Co-Ranking

Siebert and Träff's work established the foundation for co-ranking, primarily used for work partitioning in parallel merge. This paper extends their approach from 2-way to arbitrary mm-way.

Splitter-Based Parallel Sorting

Siebert and Wolf's exact splitter method operates in value space, searching for key thresholds that balance partitions. In contrast, this paper operates in index space, directly outputting co-ranking vectors.

Selection in Sorted Partitions

Frederickson-Johnson's classical work studies selection and ranking on structured inputs (such as unions of mm sorted lists). Their algorithm is essentially a value-space process, while this paper provides an index-space primitive.

Conclusions and Discussion

Main Conclusions

  1. Successfully extends two-way co-ranking to multi-way partitioning while maintaining favorable theoretical properties
  2. Index-space operations avoid value-domain search, providing deterministic complexity guarantees
  3. Algorithm is simple to implement with good practical applicability

Limitations

  1. Assumptions: Requires input sequences to be pre-sorted
  2. Application Scope: Primarily applicable to scenarios requiring exact partitioning
  3. Experimental Validation: Lacks large-scale experimental verification of performance

Future Directions

  1. Dynamic Sequences: Extend to support dynamically updated sequences
  2. Approximation Algorithms: Develop faster approximate versions
  3. Parallelization: Investigate parallelization possibilities
  4. Practical Applications: Validate effectiveness in more real-world systems

In-Depth Evaluation

Strengths

  1. Theoretical Contribution: First efficient algorithm for multi-way co-ranking, filling a theoretical gap
  2. Method Innovation: Novel index-space operation approach avoids limitations of traditional methods
  3. Rigorous Analysis: Provides complete correctness proofs and complexity analysis
  4. Practical Value: Simple, easy-to-implement algorithm with clear application scenarios

Weaknesses

  1. Missing Experiments: Paper lacks experimental verification, preventing practical performance assessment
  2. Limited Comparisons: No detailed performance comparisons with existing methods
  3. Shallow Application Discussion: Application scenarios discussed relatively simply, lacking deep analysis

Impact

  1. Academic Value: Provides theoretical foundation for multi-way co-ranking problem
  2. Practical Potential: Shows application prospects in distributed computing and parallel processing
  3. Reproducibility: Author provides implementation code, facilitating verification and extension

Applicable Scenarios

  • Data partitioning in distributed systems
  • Load balancing in parallel algorithms
  • Query optimization in database systems
  • Multi-stream merging in stream processing systems

References

1 Greg N. Frederickson and Donald B. Johnson. Generalized selection and ranking. STOC 1980.

2 Christian Siebert. Perfectly load-balanced, stable, synchronization-free parallel merge. Parallel Processing Letters, 2014.

3 Christian Siebert. Simple in-place yet comparison-optimal mergesort, arXiv:2509.24540, 2025.

4 Christian Siebert and Felix Wolf. A scalable parallel sorting algorithm using exact splitting. RWTH Aachen University technical report, 2011.


Overall Assessment: This is a theoretically strong algorithmic paper that successfully addresses the important multi-way co-ranking problem. While lacking experimental verification, it features rigorous theoretical analysis and strong methodological innovation, providing valuable theoretical tools for related fields.