Diagonal Scaling: A Multi-Dimensional Resource Model and Optimization Framework for Distributed Databases
Abdullah, Zaman
Modern cloud databases present scaling as a binary decision: scale-out by adding nodes or scale-up by increasing per-node resources. This one-dimensional view is limiting because database performance, cost, and coordination overhead emerge from the joint interaction of horizontal elasticity and per-node CPU, memory, network bandwidth, and storage IOPS. As a result, systems often overreact to load spikes, underreact to memory pressure, or oscillate between suboptimal states. We introduce the Scaling Plane, a two-dimensional model in which each distributed database configuration is represented as a point (H, V), with H denoting node count and V a vector of resources. Over this plane, we define smooth approximations of latency, throughput, coordination overhead, and monetary cost, providing a unified view of performance trade-offs. We show analytically and empirically that optimal scaling trajectories frequently lie along diagonal paths: sequences of joint horizontal and vertical adjustments that simultaneously exploit cluster parallelism and per-node improvements. To compute such actions, we propose DIAGONALSCALE, a discrete local-search algorithm that evaluates horizontal, vertical, and diagonal moves in the Scaling Plane and selects the configuration minimizing a multi-objective function subject to SLA constraints. Using synthetic surfaces, microbenchmarks, and experiments on distributed SQL and KV systems, we demonstrate that diagonal scaling reduces p95 latency by up to 40 percent, lowers cost-per-query by up to 37 percent, and reduces rebalancing by 2 to 5 times compared to horizontal-only and vertical-only autoscaling. Our results highlight the need for multi-dimensional scaling models and provide a foundation for next-generation autoscaling in cloud database systems.
academic
Diagonal Scaling: A Multi-Dimensional Resource Model and Optimization Framework for Distributed Databases
Modern cloud databases view scaling as a binary decision: horizontal scaling (scale-out) by adding nodes or vertical scaling (scale-up) by increasing single-node resources. This one-dimensional perspective has limitations because database performance, cost, and coordination overhead stem from joint interactions between horizontal elasticity and single-node CPU, memory, network bandwidth, and storage IOPS. Consequently, systems often overreact to load peaks, underreact to memory pressure, or oscillate between suboptimal states.
This paper introduces the Scaling Plane, a two-dimensional model where each distributed database configuration is represented as a point (H, V), with H denoting the number of nodes and V as a resource vector. On this plane, the authors define smooth approximations for latency, throughput, coordination overhead, and monetary cost, providing a unified view of performance trade-offs. The research demonstrates that optimal scaling trajectories typically follow diagonal paths: coordinated horizontal-vertical adjustment sequences that simultaneously leverage cluster parallelism and single-node improvements. To this end, the authors propose the DIAGONALSCALE algorithm, a discrete local search algorithm that evaluates horizontal, vertical, and diagonal movements in the scaling plane and selects configurations minimizing a multi-objective function under SLA constraints.
Experiments show that diagonal scaling reduces p95 latency by up to 40% compared to pure horizontal or pure vertical autoscaling, reduces cost-per-query by up to 37%, and decreases rebalancing by 2-5×.
The scaling decision dilemma faced by modern distributed databases:
Limitations of binary choice: Traditional approaches treat horizontal scaling (adding nodes) and vertical scaling (adding resources) as independent decisions
System behavior issues: Improper reactions to load changes, leading to over-provisioning, SLA violations, or frequent destructive rebalancing
Lack of unified view: No comprehensive model to understand multi-dimensional interactions between performance, cost, and coordination overhead
The authors observe that many workloads benefit from coordinated rather than independent horizontal and vertical resource adjustments. This motivates them to construct a unified multi-dimensional scaling model and develop algorithms capable of optimization in this space.
Scaling Plane Model: Proposes a novel two-dimensional abstraction for elastic database configurations, representing configurations as (H, V) points, where H is the number of nodes and V is a resource vector
Analytical Performance Surfaces: Derives closed-form models for latency, throughput, cost, and coordination overhead, revealing the geometric structure of these metrics on the H-V plane
DIAGONALSCALE Algorithm: Designs a discrete optimization algorithm that evaluates local neighborhoods in the H-V plane, supporting horizontal, vertical, and diagonal movements
Theoretical Guarantees: Provides proof sketches for monotonic improvement, convergence to local optimality, and stability
Comprehensive Evaluation: Demonstrates diagonal scaling advantages through synthetic surfaces, microbenchmarks, and distributed SQL/KV system experiments:
Key finding: The minimum of F rarely occurs on axis-aligned edges (pure scale-up or pure scale-out). Instead, the minimum lies in the interior, along a diagonal trajectory.
If L decreases in V and increases in H, C increases in both, and K increases in H but decreases in V, then F has at least one interior stationary point (H*, V*).
Direction diversity: Check horizontal, vertical, and diagonal movements
Stability: Penalize disruptive movements based on expected rebalancing
Monotonicity: Accept movements only if F improvement exceeds margin ε
Neighborhood definition:
N(H,V) = {(H±ΔH, V), (H, V±1), (H±ΔH, V±1)}
ΔH typically 1-2 nodes; vertical movements correspond to adjacent instance types.
Algorithm Flow (Algorithm 1):
Input: Current configuration (H,V), SLA (Lmax, Tmin)
Output: Next configuration (Hnext, Vnext)
1. Compute neighborhood N(H,V)
2. For each (H', V') in N:
a. Estimate L(H', V'), T(H', V'), K(H', V'), C(H', V')
b. If SLA violated, mark as infeasible and continue
c. Compute objective F(H', V')
d. Compute rebalancing penalty Prebalance(H,V; H', V')
e. Set F'(H', V') = F(H', V') + δPrebalance
3. Select feasible neighbor (H*, V*) minimizing F'
4. If F'(H*, V*) < F(H,V) - ε:
Return (H*, V*)
Else:
Return (H,V)
While the paper does not explicitly list traditional ablation studies, implicit ablation is performed through comparison of three strategies (H-only, V-only, Diagonal):
Without diagonal movement (H-only + V-only): Significant performance degradation
Without stability penalty: Leads to more frequent oscillation (controlled by δ parameter)
Different neighborhood sizes: 8-neighbor configuration balances exploration and computational cost