2025-11-24T07:31:17.262721

Minimizing Vertical Length in Linked Bar Charts

Broek, van Kreveld, Meulemans et al.
A linked bar chart is the augmentation of a traditional bar chart where each bar is partitioned into blocks and pairs of blocks are linked using orthogonal lines that pass over intermediate bars. The order of the blocks readily influences the legibility of the links. We study the algorithmic problem of minimizing the vertical length of these links, for a fixed bar order. The main challenge lies with ``dependent'' links, whose vertical link length cannot be optimized independently per bar. We show that, if the dependent links form a forest, the problem can be solved in $O(nm)$ time, for n bars and m links. If the dependent links between non-adjacent bars form a forest, the problem admits an $O(n^4m)$-time algorithm. Finally, we show that the general case is fixed-parameter tractable in the maximum number of links that are connected to one bar.
academic

Minimizing Vertical Length in Linked Bar Charts

Basic Information

  • Paper ID: 2511.16812
  • Title: Minimizing Vertical Length in Linked Bar Charts
  • Authors: Steven van den Broek (TU Eindhoven), Marc van Kreveld (Utrecht University), Wouter Meulemans (TU Eindhoven), Arjen Simons (TU Eindhoven)
  • Classification: cs.CG (Computational Geometry)
  • Submission Date/Conference: Submitted to arXiv on November 20, 2025; research initiated at 2024 AGA Workshop
  • Paper Link: https://arxiv.org/abs/2511.16812

Abstract

Linked bar charts are an enhanced version of traditional bar charts where each bar is subdivided into multiple blocks, with connections between blocks represented by orthogonal lines that traverse intermediate bars. The ordering of blocks directly affects the readability of connection lines. This paper investigates the algorithmic problem of minimizing the vertical length of connection lines under fixed bar ordering. The primary challenge involves "dependent links," whose vertical lengths cannot be optimized independently. The research demonstrates that: when dependent links form a forest structure, the problem is solvable in O(nm) time (n bars, m links); when dependent links between non-adjacent bars form a forest, it is solvable in O(n⁴m) time; in the general case, the problem is fixed-parameter tractable (FPT) with respect to the maximum degree of bars.

Research Background and Motivation

1. Problem Statement

Traditional bar charts can only represent single-category data. However, in many practical scenarios, certain numerical values are not uniquely attributed to a single category but are related to multiple categories. Examples include:

  • Shared quantities: Communication volume between accounts appearing in statistics for both accounts
  • Pairwise uncertainty: Voters in election polls hesitating between two parties; border factory contributions to pollution in both countries

2. Problem Significance

Visualization of cross-category data is an important requirement in data analysis. Linked bar charts represent such relationships by adding connection lines between bars, but the stacking order of blocks directly affects the vertical length of connection lines, thereby impacting chart readability and aesthetics.

3. Limitations of Existing Methods

  • Standard bar charts cannot directly visualize cross-category values
  • Although linked bar charts were recently proposed 17, the optimization problem for block stacking order has not been studied
  • Traditional graph drawing problems (such as single-page book embedding) only consider vertex ordering, not the new dimension of block stacking

4. Research Motivation

This paper is the first to systematically investigate the algorithmic problem of minimizing vertical link length in linked bar charts, providing theoretical foundations and efficient algorithms for this novel visualization method.

Core Contributions

  1. Problem Formalization: First defines the optimization problem of minimizing vertical link length in linked bar charts, introducing a classification system of "independent links" and "dependent links"
  2. Forest Structure Algorithm: Proves that when the dependent link subgraph forms a forest, the problem can be solved in O(nm) time via dynamic programming (Theorem 3)
  3. Non-Adjacent Forest Algorithm: Proves that when non-adjacent dependent links form a forest, the problem can be solved in O(n⁴m) time (Theorem 6)
  4. FPT Algorithm: Proves that in the general case, the problem is fixed-parameter tractable with respect to the maximum degree Δ of bars, with time complexity O(Δ^(3δ)·δ·n) (Theorem 8)
  5. Complexity Lower Bounds: Proves that when vertices can have multiple unlinked blocks with arbitrary ordering, the problem is strongly NP-hard (Theorem 12)

Methodology Details

Task Definition

Input: Weighted graph G = (V, E, w), where:

  • V: Fixed sequence of n vertices v₁, ..., vₙ
  • E ⊆ V²: m edges
  • w: V ∪ E → ℝ⁺: Weight function (vertex weights = single-category values, edge weights = cross-category values)

Output: Stacking order of blocks within each bar such that:

  • Total vertical link length is minimized
  • Links sharing endpoints do not cross

Constraints:

  • Horizontal ordering of bars is fixed
  • Unlinked blocks are placed at the bottom of bars
  • Links must traverse all intermediate bars

Core Concepts

The paper divides links into two major categories:

Independent Links (IL): Vertical length can be independently optimized by placing blocks at a fixed target t with cost |t - y| + |t - y'|. Three cases:

  1. Some intermediate bar is higher than the highest possible position of some endpoint → target is the height of the highest intermediate bar H
  2. Possible position intervals of two endpoints do not overlap internally → target is the lowest position of the higher interval
  3. Some endpoint position is fixed (corresponding sequence is empty) → target is that fixed position

Dependent Links (DL): Vertical length depends on the relative positions of two blocks, cannot be assigned a static target. Further subdivided into:

  • Adjacent Dependent Links (ADL): Connect adjacent bars
  • Non-Adjacent Dependent Links (NADL): Connect non-adjacent bars

Key Lemma 1: The dependent link subgraph is outerplanar, hence has treewidth ≤ 2.

2. Position Representation and Cost Calculation

For block b in bar Bⱼ (corresponding to left-edge lᵢ ∈ Lⱼ), its vertical center coordinate is:

y(b, k) = ½h(b) + w(vⱼ) + Σ(x=1 to i-1)h(lₓ) + Σ(x=1 to k)h(rₓ)

where k represents the number of right-edges below it.

Cost Function:

  • Independent block b at position i: λ(b, i) = |t - y(b, i)|
  • Dependent link e at position (i,j):
    λ(e, i, j) = {
      |H - y(b,i)| + |H - y(b',j)|  if both endpoints are below H
      |y(b,i) - y(b',j)|            otherwise
    }
    

Algorithm Design

Core Idea: Utilize tree structure to determine bar processing order, compute via bottom-up dynamic programming.

Algorithm Steps:

  1. Arbitrarily root each tree T in the forest
  2. For each bar B, let lₚ* be the block connecting to parent node (parent link)
  3. Define dynamic programming table:
    • P(B, k): Minimum cost of subtree Tᵦ with k right-edges below parent link block
    • D↓(p, q): Minimum cost of first p left-edges and q right-edges
    • D↑(p, q): Minimum cost of subsequent blocks
  4. Decomposition Strategy: Parent link block lₚ* at position k divides the bar into two independent parts:
    P(B, k) = D↓(p*-1, k) + D↑(p*+1, k+1)
    
  5. Recursive Computation of D↓:
    D↓(p, q) = min{
      D↓(p-1, q) + cost(lₚ, q),
      D↓(p, q-1) + cost(rᵧ, p)
    }
    

    For dependent blocks, cost is: minⱼ λ(e, i, j) + P(B', j)

Time Complexity Analysis:

  • Computing D table for bar with degree d requires O(d²)
  • Precomputing dependent block costs: O(d·d')
  • Total time: O(Σd² + Σd·d') = O(nm)

Extension Challenge: Allow adjacent dependent links (ADL), requiring handling of more complex dependencies.

Key Techniques:

  1. Extended forest F: Contains all NADL and maximal ADL set (non-cyclic)
  2. Enhanced state representation: P*(B, k, l, r), parameterized by three possible single-endpoint dependent links
  3. Handling ADL:
    • Let a←,1, a←,2, ... be left ADL, a→,1, a→,2, ... be right ADL
    • Dynamic programming tables D↓(p, q, x, y) and D↑(p, q, x, y) track ADL positions

Recurrence Formula (when lₚ is a dependent block):

D↓(p, q, x, y) = min over χ of [
  min over x'(D↓(p-1, q, x', y) + λ(a←,i, χ, x')) +
  min over k'(P(B', k', x, χ) + λ(lₚ, k', q))
]

Time Complexity:

  • Per (p,q) pair: O(Δ³) precomputation + O(Δ³) computation of D
  • Total O(d²) pairs, O(Δ³d²) per bar
  • Computing P: O(Δ⁴d)
  • Total time: O(n⁴m)

Algorithm 3: General Case FPT Algorithm (O(Δ^(3δ)·δ·n) Time)

Core Technique: Tree Decomposition

Algorithm Framework:

  1. Construct nice tree decomposition (T, X, r) of dependent link subgraph G'
    • Treewidth ≤ 2 (outerplanar property)
    • O(n) nodes, each bag contains at most 3 bars
    • Constructible in O(n) time
  2. Define state: P*(u, S₁, S₂, S₃)
    • u: Node in tree decomposition
    • Sᵢ: State of bar Bᵢ in bag (position of each dependent link)
    • Represents minimum cost of all blocks and links in "past" of u
  3. State Count (Lemma 9):
    • Per bar: O(Δ^δ) states (injective functions from δ dependent links to Δ positions)
    • Per node: O(Δ^(3δ)) states
  4. Processing by Node Type:
    • Leaf node: P*(u) = 0, O(1) time
    • Join node: P*(u, ...) = P*(v, ...) + P*(w, ...), O(Δ^(3δ)·δ) time
    • Introduce node: Directly inherit from child, O(Δ^(3δ)·δ) time
    • Forget node: Most complex, handles independent blocks and dependent links of forgotten bar
  5. Forget Node Processing (Lemma 11):
    P*(u, S₁, S₂) = min over S∈Sf [
      P*(v, S₁, S₂, S) + IL(S) + DL(S, S₁) + DL(S, S₂)
    ]
    
    • Precompute Dᵢ,ⱼ(p, q): Minimum cost of independent block subset, O(Δ³) time
    • Per state: O(Δ^δ·δ) time
    • Total: O(Δ^(3δ)·δ) time

Time Complexity: O(n) nodes × O(Δ^(3δ)·δ) = O(Δ^(3δ)·δ·n)

Corollary:

  • When Δ = O(1), algorithm runs in polynomial time
  • When only δ = O(1) (dependent link degree bounded), algorithm still runs in polynomial time O(n)

Technical Innovations

  1. Link Classification System: Independent/dependent link classification enables problem decomposition; independent links optimize locally, dependent links require global consideration
  2. Hierarchical Dynamic Programming:
    • Algorithm 1: Exploits tree structure decomposition
    • Algorithm 2: Introduces parameterized states for ADL handling
    • Algorithm 3: Utilizes tree decomposition for general case
  3. Outerplanar Graph Property Exploitation: Outerplanarity of dependent link subgraph guarantees treewidth ≤ 2, providing theoretical foundation for FPT algorithm
  4. Precomputation Strategy: Precompute independent block subset costs at forget nodes, avoiding redundant computation

Experimental Setup

Note: This is a theoretical algorithms paper without experimental validation section. The paper focuses on algorithm design and complexity analysis.

Complexity Proofs

The paper proves problem hardness through reduction:

Theorem 12 (Strong NP-Hardness): When vertices can have multiple unlinked blocks with arbitrary ordering, minimizing vertical link length is strongly NP-hard.

Proof Method: Reduction from 3-Partition problem

  • Construction: 2k-1 bars, B₀ has n unlinked blocks (corresponding to 3-Partition instance)
  • Key property: Only B₀'s unlinked block ordering is adjustable
  • Equivalence: Zero vertical length scheme exists ⟺ 3-Partition solution exists

Results

This is a pure theory paper without experimental results section. Main results are:

Algorithm Complexity Summary

Dependent Link StructureTime ComplexityTheorem
No dependent linksO(nm)Lemma 5
Forest structureO(nm)Theorem 3
Non-adjacent forestO(n⁴m)Theorem 6
General caseO(Δ^(3δ)·δ·n)Theorem 8
Multiple unlinked blocksStrongly NP-hardTheorem 12

Theoretical Findings

  1. Structural Dependency: Problem complexity strongly depends on dependent link graph structure
  2. Degree Parameterization: FPT-solvable in general case; polynomial time when dependent degree is bounded
  3. Bar Height Structure:
    • Single local maximum → dependent links form path collection
    • Single local minimum → non-adjacent dependent links form forest

1. Graph Drawing

  • Single-Page Book Embedding: Outerplanar graphs are exactly those drawable without crossings 1
  • Traditional Optimization Targets:
    • Crossing minimization 5,13,14
    • Edge length minimization 15,16
    • Bend minimization 2,10,13,15
  • This Paper's Contribution: First to consider block stacking order as new dimension

2. Visualization Quality Metrics

  • Storyline Visualization: Considers vertical distance 9
  • Parallel Coordinates: Screen-space metrics 7
  • This Paper's Extension: Applies vertical distance metrics to linked bar charts

3. Graph Optimization Problems

  • Outerplanar Graphs: Minimizing total/maximum edge length, cutwidth, bandwidth solvable in polynomial time 11
  • General Graphs: These problems are NP-hard 12
  • This Paper's Position: Between polynomial and NP-hard, analyzed via parameterized complexity

4. Linked Bar Charts

  • Original Proposal: van Beusekom et al. 17 for visualizing dependent and independent uncertainties
  • This Paper's Contribution: First systematic study of algorithmic optimization problems

Conclusions and Discussion

Main Conclusions

  1. Hierarchical Complexity: Problem complexity ranges from O(nm) (forest structure) to FPT (general case) to strongly NP-hard (extended version)
  2. Practical Algorithms: For common structured data (e.g., unimodal/unimodal height distributions), efficient polynomial algorithms exist
  3. Parameterized Tractability: In general case, problem is efficiently solvable when bar degree is bounded

Limitations

  1. Fixed Bar Ordering: Algorithms assume bar ordering is predetermined; joint optimization not considered
  2. Theoretical Gaps: General case exact complexity (P vs NP) remains undetermined
  3. Missing Experimental Validation: No actual implementation or performance evaluation provided
  4. Instance Structure Requirements: FPT algorithm may be impractical on high-degree instances

Future Directions

The paper explicitly proposes the following research directions:

  1. Complexity Determination: Determine whether general case is NP-hard under fixed bar ordering
  2. Joint Optimization: Simultaneously optimize bar ordering and block stacking order
  3. Extended Designs:
    • Asymmetric Links: Different block heights at endpoints
    • Alternative Metrics: Bend minimization, etc.
    • Directed Graphs and Multigraphs: Ordering and grouping of multiple links
    • Hypergraphs: Links among three or more bars
  4. Practical Applications: Evaluate algorithm performance on real datasets

In-Depth Evaluation

Strengths

  1. Theoretical Rigor:
    • Complete complexity hierarchy (polynomial to FPT to NP-hard)
    • Rigorous mathematical proofs and time complexity analysis
    • Clear problem formalization
  2. Clever Algorithm Design:
    • Independent/dependent link classification provides effective problem decomposition
    • Three algorithms employ different techniques (forest structure, tree decomposition)
    • Sophisticated dynamic programming design fully exploits problem structure
  3. Result Completeness:
    • Covers multiple special cases and general case
    • Provides upper bounds (algorithms) and lower bounds (NP-hardness)
    • Parameterized analysis provides fine-grained complexity characterization
  4. Writing Clarity:
    • Clear concept definitions (link types, past, etc.)
    • Illustrative figures (Figures 3-8)
    • Progressive algorithm presentation

Weaknesses

  1. Missing Practical Validation:
    • No real data experiments
    • Actual efficiency of FPT algorithm for large Δ unknown
    • No comparison with heuristic algorithms
  2. General Case Complexity Gap:
    • Whether fixed bar ordering is NP-hard remains undetermined
    • Reduction proof depends on multiple unlinked blocks assumption, differs from original problem
  3. High Algorithm Complexity:
    • Algorithm 2's O(n⁴m) may be impractical for large instances
    • Algorithm 3's exponential dependence on Δ^(3δ) explodes for large degree
  4. Problem Scope Limitations:
    • Only considers vertical length as single objective
    • Does not address crossing minimization and other important quality metrics
    • Fixed bar ordering assumption is restrictive

Impact

  1. Theoretical Contribution:
    • Pioneering study of linked bar chart algorithmic problems
    • Opens new research direction for visualization optimization
    • Demonstrates parameterized complexity application in visualization
  2. Practical Value:
    • Provides theoretical guidance for practical systems
    • Special case algorithms directly applicable
    • Provides benchmark for heuristic algorithm design
  3. Reproducibility:
    • Detailed algorithm descriptions
    • Complete mathematical derivations
    • Lacks code implementation

Applicable Scenarios

  1. Direct Application:
    • Data with unimodal/unimodal bar height distribution (Algorithms 1, 2)
    • Sparse graphs with small bar degree (Algorithm 3)
    • Scenarios with simple dependent link structure
  2. Requiring Extensions:
    • Large-scale dense graphs (need heuristic algorithms)
    • Scenarios requiring joint bar ordering optimization
    • Multi-objective optimization (length + crossing + bending)
  3. Theoretical Guidance:
    • Visualization system design
    • Data preprocessing (e.g., sorting bars to form forest structure)
    • Approximation algorithm evaluation benchmark

Key References

1 Bernhart & Kainen (1979): The book thickness of a graph - Foundational theory of single-page book embedding

6 Cygan et al. (2015): Parameterized algorithms - Standard textbook for FPT algorithms

11 Frederickson & Hambrusch (1988): Planar linear arrangements of outerplanar graphs - Polynomial algorithms for outerplanar graph optimization

17 van Beusekom et al. (2024): Visualizing set sizes with dependent and independent uncertainties - Original proposal of linked bar charts


Overall Assessment: This is a high-quality theoretical computational geometry paper that systematically investigates the algorithmic problem of linked bar chart optimization. The paper is theoretically rigorous, features clever algorithm design, and provides complete complexity analysis. Its main value lies in establishing solid theoretical foundations for this novel visualization method. Shortcomings include missing experimental validation and incomplete complexity characterization for the general case. Future integration of real data evaluation and heuristic algorithm design would significantly enhance its practical value.