2025-11-28T03:34:19.410649

Diagonal Scaling: A Multi-Dimensional Resource Model and Optimization Framework for Distributed Databases

Abdullah, Zaman

Modern cloud databases present scaling as a binary decision: scale-out by adding nodes or scale-up by increasing per-node resources. This one-dimensional view is limiting because database performance, cost, and coordination overhead emerge from the joint interaction of horizontal elasticity and per-node CPU, memory, network bandwidth, and storage IOPS. As a result, systems often overreact to load spikes, underreact to memory pressure, or oscillate between suboptimal states. We introduce the Scaling Plane, a two-dimensional model in which each distributed database configuration is represented as a point (H, V), with H denoting node count and V a vector of resources. Over this plane, we define smooth approximations of latency, throughput, coordination overhead, and monetary cost, providing a unified view of performance trade-offs. We show analytically and empirically that optimal scaling trajectories frequently lie along diagonal paths: sequences of joint horizontal and vertical adjustments that simultaneously exploit cluster parallelism and per-node improvements. To compute such actions, we propose DIAGONALSCALE, a discrete local-search algorithm that evaluates horizontal, vertical, and diagonal moves in the Scaling Plane and selects the configuration minimizing a multi-objective function subject to SLA constraints. Using synthetic surfaces, microbenchmarks, and experiments on distributed SQL and KV systems, we demonstrate that diagonal scaling reduces p95 latency by up to 40 percent, lowers cost-per-query by up to 37 percent, and reduces rebalancing by 2 to 5 times compared to horizontal-only and vertical-only autoscaling. Our results highlight the need for multi-dimensional scaling models and provide a foundation for next-generation autoscaling in cloud database systems.

Basic Information

Paper ID: 2511.21612
Title: Diagonal Scaling: A Multi-Dimensional Resource Model and Optimization Framework for Distributed Databases
Authors: Shahir Abdullah, Syed Rohit Zaman
Category: cs.DC (Distributed Computing)
Publication Date: November 26, 2025 (arXiv v1)
Paper Link: https://arxiv.org/abs/2511.21612

Abstract

Modern cloud databases view scaling as a binary decision: horizontal scaling (scale-out) by adding nodes or vertical scaling (scale-up) by increasing single-node resources. This one-dimensional perspective has limitations because database performance, cost, and coordination overhead stem from joint interactions between horizontal elasticity and single-node CPU, memory, network bandwidth, and storage IOPS. Consequently, systems often overreact to load peaks, underreact to memory pressure, or oscillate between suboptimal states.

This paper introduces the Scaling Plane, a two-dimensional model where each distributed database configuration is represented as a point (H, V), with H denoting the number of nodes and V as a resource vector. On this plane, the authors define smooth approximations for latency, throughput, coordination overhead, and monetary cost, providing a unified view of performance trade-offs. The research demonstrates that optimal scaling trajectories typically follow diagonal paths: coordinated horizontal-vertical adjustment sequences that simultaneously leverage cluster parallelism and single-node improvements. To this end, the authors propose the DIAGONALSCALE algorithm, a discrete local search algorithm that evaluates horizontal, vertical, and diagonal movements in the scaling plane and selects configurations minimizing a multi-objective function under SLA constraints.

Experiments show that diagonal scaling reduces p95 latency by up to 40% compared to pure horizontal or pure vertical autoscaling, reduces cost-per-query by up to 37%, and decreases rebalancing by 2-5×.

Research Background and Motivation

1. Core Problem to Address

The scaling decision dilemma faced by modern distributed databases:

Limitations of binary choice: Traditional approaches treat horizontal scaling (adding nodes) and vertical scaling (adding resources) as independent decisions
System behavior issues: Improper reactions to load changes, leading to over-provisioning, SLA violations, or frequent destructive rebalancing
Lack of unified view: No comprehensive model to understand multi-dimensional interactions between performance, cost, and coordination overhead

2. Problem Significance

Economic impact: Cloud databases are critical infrastructure (finance, e-commerce, logistics, social networks); improper scaling decisions cause massive cost waste
Performance criticality: Scaling decisions directly impact latency, throughput, and availability
Operational complexity: Incorrect scaling strategies lead to frequent data rebalancing, leadership changes, and system instability

3. Limitations of Existing Approaches

Problems with scale-out (horizontal scaling):

Increases consensus overhead (Paxos/Raft message counts)
Expands replica group size
Increases replication fanout
Triggers more leadership changes
Causes expensive data rebalancing

Problems with scale-up (vertical scaling):

Memory upgrades cannot resolve cross-partition data skew
CPU upgrades cannot resolve metadata bottlenecks
Eventually hits hardware limits
Shows diminishing returns

Shortcomings of existing autoscaling:

Kubernetes HPA and VPA handle the two dimensions separately
Reactive policies based on simple thresholds (e.g., CPU > 70%)
Ignore non-linear interactions between dimensions
Cannot compute diagonal trajectories

4. Research Motivation

The authors observe that many workloads benefit from coordinated rather than independent horizontal and vertical resource adjustments. This motivates them to construct a unified multi-dimensional scaling model and develop algorithms capable of optimization in this space.

Core Contributions

Scaling Plane Model: Proposes a novel two-dimensional abstraction for elastic database configurations, representing configurations as (H, V) points, where H is the number of nodes and V is a resource vector
Analytical Performance Surfaces: Derives closed-form models for latency, throughput, cost, and coordination overhead, revealing the geometric structure of these metrics on the H-V plane
DIAGONALSCALE Algorithm: Designs a discrete optimization algorithm that evaluates local neighborhoods in the H-V plane, supporting horizontal, vertical, and diagonal movements
Theoretical Guarantees: Provides proof sketches for monotonic improvement, convergence to local optimality, and stability
Comprehensive Evaluation: Demonstrates diagonal scaling advantages through synthetic surfaces, microbenchmarks, and distributed SQL/KV system experiments:
- p95 latency reduction up to 40%
- Cost-per-query reduction up to 37%
- Rebalancing reduction 2-5×

Methodology Details

Task Definition

Input:

Current configuration: (H, V), where H is the number of nodes, V = (c, r, b, s) represents single-node CPU, RAM, bandwidth, and storage IOPS
Workload characteristics: request rate λ, read-write ratio, access distribution
SLA constraints: maximum latency Lmax, minimum throughput Tmin

Output:

Next optimal configuration: (Hnext, Vnext)

Objectives:

Minimize multi-objective function F(H,V) = αL(H,V) + βC(H,V) + γK(H,V)
Satisfy SLA constraints: L(H,V) ≤ Lmax and T(H,V) ≥ Tmin

Model Architecture

1. Resource Space Definition

The configuration space is defined as:

S = {(H,V) : H ≥ 1, c, r, b, s > 0}

where H is a discrete integer (number of nodes) and V is selected from a finite set of instance types.

2. Performance Surface Modeling

(a) Node-Intrinsic Latency

Uses a weighted harmonic form:

Lnode(V) = α/c + β/r + γ/b + δ/s

This captures:

CPU's strong influence on compute-intensive operations
RAM's impact on working set and cache behavior
Network bandwidth's role in replication and RPC
Storage IOPS' effect on LSM tree compaction and log flushing

(b) Coordination Latency

Coordination cost grows with cluster size due to consensus protocols, global timestamps, and metadata synchronization:

Lcoord(H) = η log H + μH^θ

where 0 < θ < 1 creates a superlogarithmic but sublinear growth curve.

(c) Total Latency

L(H,V) = Lnode(V) + Lcoord(H)

Key properties:

∂L/∂H > 0 (latency increases with more nodes)
∂L/∂||V|| < 0 (latency decreases with more resources)
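
The latency surface above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the coefficient values and the two example resource vectors are assumptions chosen only to make the monotonicity properties visible.

```python
import math
from typing import NamedTuple


class Resources(NamedTuple):
    c: float  # CPU (vCPUs)
    r: float  # RAM (GiB)
    b: float  # network bandwidth (Gbps)
    s: float  # storage IOPS (thousands)


# Assumed coefficients for illustration; the paper fits these by calibration.
A_CPU, A_RAM, A_NET, A_IO = 8.0, 16.0, 4.0, 6.0  # alpha, beta, gamma, delta
ETA, MU, THETA = 0.5, 0.2, 0.5                   # requires 0 < THETA < 1


def l_node(v: Resources) -> float:
    """Node-intrinsic latency: alpha/c + beta/r + gamma/b + delta/s."""
    return A_CPU / v.c + A_RAM / v.r + A_NET / v.b + A_IO / v.s


def l_coord(h: int) -> float:
    """Coordination latency: eta * log(H) + mu * H**theta."""
    return ETA * math.log(h) + MU * h ** THETA


def latency(h: int, v: Resources) -> float:
    """Total latency L(H, V) = L_node(V) + L_coord(H)."""
    return l_node(v) + l_coord(h)


small = Resources(c=2, r=8, b=1, s=3)    # hypothetical instance sizes
large = Resources(c=8, r=32, b=5, s=12)

print(latency(4, small), latency(8, small), latency(4, large))
```

With these assumed values, adding nodes at fixed V raises latency, while moving from `small` to `large` at fixed H lowers it, matching the two key properties.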

(d) Throughput Surface

Single-node throughput:

Tnode(V) = κ · min(c, r, b, s)

Cluster throughput accounting for diminishing returns:

T(H,V) = H · Tnode(V) · φ(H)

where:

φ(H) = 1 / (1 + ω log H)

reflects increased coordination overhead and replication costs.
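
A short sketch of the throughput surface follows. It assumes the four resources are expressed in normalized, comparable units so that taking their minimum is meaningful; the values of κ and ω are illustrative assumptions.

```python
import math

KAPPA = 1000.0  # assumed ops/s per unit of the bottleneck resource
OMEGA = 0.15    # assumed strength of the coordination discount


def t_node(c: float, r: float, b: float, s: float) -> float:
    """Single-node throughput, limited by the scarcest resource."""
    return KAPPA * min(c, r, b, s)


def phi(h: int) -> float:
    """Scaling efficiency: 1 / (1 + omega * log H)."""
    return 1.0 / (1.0 + OMEGA * math.log(h))


def throughput(h: int, c: float, r: float, b: float, s: float) -> float:
    """Cluster throughput T(H, V) = H * T_node(V) * phi(H)."""
    return h * t_node(c, r, b, s) * phi(h)


one = throughput(1, 4, 8, 2, 6)
two = throughput(2, 4, 8, 2, 6)
print(one, two, two / one)  # the ratio stays below 2: diminishing returns
```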

(e) Coordination Overhead Surface

For write-intensive workloads with write arrival rate λw:

K(H,V) = ρ · Lcoord(H) · λw / T(H,V)

Intuition:

Coordination overhead increases with write load
Decreases as throughput increases
Rises with larger cluster size
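
The formula for K can be checked numerically with a small helper. In this sketch, coordination latency and throughput are passed in as plain values (an assumption made to keep the example short), so the influence of H enters through both arguments; ρ = 1 is an illustrative default.

```python
def coordination_overhead(l_coord: float, write_rate: float,
                          throughput: float, rho: float = 1.0) -> float:
    """K = rho * L_coord * lambda_w / T."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return rho * l_coord * write_rate / throughput


base = coordination_overhead(l_coord=1.0, write_rate=500.0, throughput=10_000.0)
more_writes = coordination_overhead(1.0, 1_000.0, 10_000.0)
more_capacity = coordination_overhead(1.0, 500.0, 20_000.0)
slower_consensus = coordination_overhead(2.0, 500.0, 10_000.0)

print(base, more_writes, more_capacity, slower_consensus)
```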

(f) Monetary Cost Surface

C(H,V) = H · Cnode(V)

where Cnode(V) is the cloud cost of an instance with resources V.

3. Multi-Objective Optimization

Define the objective function:

F(H,V) = αL(H,V) + βC(H,V) + γK(H,V)

Constraints:

L(H,V) ≤ Lmax
T(H,V) ≥ Tmin

This creates a two-dimensional non-convex optimization problem.
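
As a rough sketch, the objective and the SLA check can be combined as below. The instance ladder, prices, model coefficients, weights, and SLA thresholds are all hypothetical values chosen for illustration, not figures from the paper.

```python
import math

# Hypothetical instance ladder in normalized units: (c, r, b, s, hourly cost).
INSTANCES = {
    "Small": (1, 1, 1, 1, 0.10),
    "Medium": (2, 2, 2, 2, 0.22),
    "Large": (4, 4, 4, 4, 0.48),
    "XLarge": (8, 8, 8, 8, 1.05),
}


def surfaces(h: int, vtype: str, write_rate: float = 500.0):
    """Return (L, T, K, C) for a configuration, using assumed coefficients."""
    c, r, b, s, price = INSTANCES[vtype]
    l_coord = 0.5 * math.log(h) + 0.2 * h ** 0.5
    lat = 4 / c + 4 / r + 2 / b + 2 / s + l_coord
    thr = h * 1000.0 * min(c, r, b, s) / (1 + 0.15 * math.log(h))
    coord = 1.0 * l_coord * write_rate / thr
    cost = h * price
    return lat, thr, coord, cost


def objective(h: int, vtype: str, alpha=1.0, beta=2.0, gamma=1.0) -> float:
    """F(H, V) = alpha * L + beta * C + gamma * K."""
    lat, _, coord, cost = surfaces(h, vtype)
    return alpha * lat + beta * cost + gamma * coord


def feasible(h: int, vtype: str, l_max=5.0, t_min=10_000.0) -> bool:
    """SLA check: L <= L_max and T >= T_min."""
    lat, thr, _, _ = surfaces(h, vtype)
    return lat <= l_max and thr >= t_min


for cfg in [(4, "Small"), (1, "XLarge"), (4, "Large")]:
    print(cfg, feasible(*cfg), round(objective(*cfg), 3))
```

Under these assumptions, the pure scale-out point (4, Small) fails the latency bound and the pure scale-up point (1, XLarge) fails the throughput bound, while the mixed point (4, Large) satisfies both.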

4. Surface Geometry Insights

Key finding: The minimum of F rarely occurs on axis-aligned edges (pure scale-up or pure scale-out). Instead, the minimum lies in the interior, along a diagonal trajectory.

This is because:

L decreases along V but increases along H
T increases with both H and V but saturates
C grows linearly with H, superlinearly with V
K grows with H but decreases with V

Technical Innovations

1. Diagonal Scaling Theory

Trajectory definition:

τ(t) = (H(t), V(t))

where both H and V increase with t. Let the slope be m = d||V||/dH, so that (1, m) is the trajectory direction in (H, ||V||) coordinates.

Gradient alignment condition:

The gradient of the objective function:

∇F = (∂F/∂H, ∂F/∂||V||)

Local optimality along trajectory direction (1, m) satisfies:

∇F(H*, V*) · (1, m*) = 0

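At such a point the directional derivative of F along (1, m*) vanishes. Away from it, F decreases fastest along -∇F, so the preferred diagonal direction is the one most closely aligned with -∇F.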
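
The stationarity condition can be solved for the slope directly. The step below treats (1, m) as a direction in (H, ||V||) coordinates and assumes ∂F/∂||V|| ≠ 0:

```latex
\nabla F(H^*, V^*) \cdot (1, m^*)
  = \frac{\partial F}{\partial H} + m^* \,\frac{\partial F}{\partial \lVert V \rVert} = 0
\quad\Longrightarrow\quad
m^* = -\,\frac{\partial F / \partial H}{\partial F / \partial \lVert V \rVert}
```

When the two partial derivatives have opposite signs, m* is positive, so H and ||V|| change in the same direction along the trajectory.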

Lemma 1 (Axis-aligned scaling rarely optimal):

If ∂F/∂H ≠ 0 and ∂F/∂||V|| ≠ 0, then the optimal direction is neither horizontal nor vertical.

Proof sketch: If optimal scaling is horizontal, the direction vector is (1, 0). But:

∇F · (1, 0) = ∂F/∂H ≠ 0

Contradiction. Vertical scaling follows similarly. Therefore diagonal scaling is necessary. □

Proposition (Existence of interior minimum):

If L decreases in V and increases in H, C increases in both, and K increases in H but decreases in V, then F has at least one interior stationary point (H*, V*).

2. DIAGONALSCALE Algorithm

Design principles:

Local search: Explore neighbors around (H, V)
SLA-aware: Consider only feasible configurations
Direction diversity: Check horizontal, vertical, and diagonal movements
Stability: Penalize disruptive movements based on expected rebalancing
Monotonicity: Accept movements only if F improvement exceeds margin ε

Neighborhood definition:

N(H,V) = {(H±ΔH, V), (H, V±1), (H±ΔH, V±1)}

ΔH typically 1-2 nodes; vertical movements correspond to adjacent instance types.
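
A minimal sketch of the neighborhood construction, assuming V is represented as an index into an ordered ladder of instance types and that candidates outside the allowed range are simply dropped (the bounds used here are assumptions):

```python
from itertools import product

INSTANCE_TYPES = ["Small", "Medium", "Large", "XLarge"]


def neighborhood(h: int, v_idx: int, delta_h: int = 1,
                 h_min: int = 1, h_max: int = 12) -> list[tuple[int, int]]:
    """Horizontal, vertical, and diagonal neighbors of (H, V)."""
    out = []
    for dh, dv in product((-delta_h, 0, delta_h), (-1, 0, 1)):
        if dh == 0 and dv == 0:
            continue  # the current configuration is not its own neighbor
        nh, nv = h + dh, v_idx + dv
        if h_min <= nh <= h_max and 0 <= nv < len(INSTANCE_TYPES):
            out.append((nh, nv))
    return out


print(neighborhood(4, 1))  # interior point: all 8 neighbors
print(neighborhood(1, 0))  # corner point: only 3 neighbors remain
```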

Algorithm Flow (Algorithm 1):

Input: Current configuration (H,V), SLA (Lmax, Tmin)
Output: Next configuration (Hnext, Vnext)

1. Compute neighborhood N(H,V)
2. For each (H', V') in N:
   a. Estimate L(H', V'), T(H', V'), K(H', V'), C(H', V')
   b. If SLA violated, mark as infeasible and continue
   c. Compute objective F(H', V')
   d. Compute rebalancing penalty Prebalance(H,V; H', V')
   e. Set F'(H', V') = F(H', V') + δPrebalance
3. Select feasible neighbor (H*, V*) minimizing F'
4. If F'(H*, V*) < F(H,V) - ε:
   Return (H*, V*)
   Else:
   Return (H,V)
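
The steps of Algorithm 1 translate into Python roughly as follows. This is a sketch under stated assumptions: the surface estimators are passed in as callables, V is an index on the instance ladder, and the toy surfaces at the bottom are hypothetical stand-ins for calibrated models.

```python
from typing import Callable, Iterable, Tuple

Config = Tuple[int, int]  # (H, index of the instance type on the vertical ladder)


def diagonal_scale_step(
    current: Config,
    neighbors: Iterable[Config],
    latency: Callable[[Config], float],
    throughput: Callable[[Config], float],
    objective: Callable[[Config], float],
    penalty: Callable[[Config, Config], float],
    l_max: float,
    t_min: float,
    delta: float,
    eps: float,
) -> Config:
    best, best_score = current, float("inf")
    for cand in neighbors:
        # Step 2b: skip neighbors that violate the SLA.
        if latency(cand) > l_max or throughput(cand) < t_min:
            continue
        # Steps 2c-2e: penalized objective F' = F + delta * P_rebalance.
        score = objective(cand) + delta * penalty(current, cand)
        if score < best_score:
            best, best_score = cand, score
    # Step 4: move only if the improvement exceeds the margin eps.
    return best if best_score < objective(current) - eps else current


# Toy surfaces (hypothetical) with a minimum at H = 3, V index = 2.
def toy_f(cfg: Config) -> float:
    return float((cfg[0] - 3) ** 2 + (cfg[1] - 2) ** 2)


def toy_p(old: Config, new: Config) -> float:
    return float(abs(new[0] - old[0]) + abs(new[1] - old[1]))


def around(cfg: Config) -> list[Config]:
    h, v = cfg
    return [(h + dh, v + dv) for dh in (-1, 0, 1) for dv in (-1, 0, 1)
            if (dh, dv) != (0, 0)]


def step(cfg: Config) -> Config:
    return diagonal_scale_step(cfg, around(cfg), lambda c: 1.0, lambda c: 10.0,
                               toy_f, toy_p, l_max=2.0, t_min=5.0,
                               delta=0.1, eps=0.01)


print(step((2, 1)), step((3, 2)))
```

On the toy surface, the first call takes a diagonal move and the second stays put, because no neighbor improves on the current objective by more than ε.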

Rebalancing penalty:

Prebalance = λ1|H' - H| + λ2||V' - V||1 + λ3·ShardMovement(H,V → H', V')

Shard movement estimation can be obtained using partition metadata.
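
One way the penalty might be computed is sketched below. The shard-movement estimator is a hypothetical stand-in (it assumes evenly spread shards rather than reading partition metadata), and the λ weights are illustrative values.

```python
from typing import Sequence


def shard_movement_fraction(h_old: int, h_new: int) -> float:
    """Hypothetical estimate of the share of data relocated when the node
    count changes, assuming shards are spread evenly across nodes."""
    return abs(h_new - h_old) / max(h_old, h_new)


def rebalance_penalty(h_old: int, v_old: Sequence[float],
                      h_new: int, v_new: Sequence[float],
                      lam1: float = 1.0, lam2: float = 0.5,
                      lam3: float = 10.0) -> float:
    """P = lam1*|H' - H| + lam2*||V' - V||_1 + lam3*ShardMovement."""
    l1 = sum(abs(a - b) for a, b in zip(v_new, v_old))
    return (lam1 * abs(h_new - h_old)
            + lam2 * l1
            + lam3 * shard_movement_fraction(h_old, h_new))


v = (2, 8, 1, 3)
print(rebalance_penalty(4, v, 5, v))              # horizontal move
print(rebalance_penalty(4, v, 4, (4, 16, 2, 6)))  # vertical move
```

In practice the L1 term would likely need per-resource normalization, since raw units such as GiB and Gbps are not directly comparable.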

Complexity analysis:

Neighborhood size |N| = 8 (two horizontal, two vertical, and four diagonal neighbors). Each evaluation computes closed-form expressions in O(1) time.

Therefore, time complexity per scaling decision: O(|N|) = O(1)

Convergence theorem:

If objective evaluation is exact and the space is finite (finite H and finite instance types), DIAGONALSCALE converges to a local minimum.

Proof sketch: Monotonic descent + discrete finite state space → guaranteed termination.
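
The termination argument can be illustrated with a simplified descent loop on a finite grid. This sketch omits SLA checks and the rebalancing penalty, and the objective is a toy surface assumed for illustration; it only shows that strict descent over a finite state space must stop.

```python
from itertools import product

H_VALUES = range(1, 13)  # finite set of node counts
V_LEVELS = range(4)      # finite instance ladder, by index


def objective(h: int, v: int) -> float:
    """Toy surface (hypothetical) with its minimum at H = 6, V index = 2."""
    return (h - 6) ** 2 + 4 * (v - 2) ** 2


def neighbors(h: int, v: int):
    for dh, dv in product((-1, 0, 1), repeat=2):
        if (dh, dv) != (0, 0) and (h + dh) in H_VALUES and (v + dv) in V_LEVELS:
            yield h + dh, v + dv


def run_to_fixed_point(h: int, v: int, eps: float = 1e-9):
    """Repeat the local-search step until no neighbor improves F by eps."""
    path = [(h, v)]
    while True:
        best = min(neighbors(h, v), key=lambda n: objective(*n))
        if objective(*best) >= objective(h, v) - eps:
            return path
        h, v = best
        path.append(best)


path = run_to_fixed_point(1, 0)
print(path)
```

Each accepted move strictly lowers the objective, so no configuration is visited twice and the loop ends after finitely many steps.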

Stability proposition:

If δ is sufficiently large, DIAGONALSCALE avoids configuration oscillation under fluctuating workloads.

Experimental Setup

Datasets and Systems

Test systems:

CockroachDB (distributed SQL): Uses Raft consensus, range-based partitioning, and dynamic rebalancing
Redis Cluster (distributed KV): Uses hash slot sharding and asynchronous replication
Synthetic model: Parameterized analytical scaling plane surfaces

Configuration Space

Horizontal scale:

H ∈ {1, 2, 4, 8, 12}

Vertical instance types:

V ∈ {Small, Medium, Large, XLarge}

Each type maps to (c, r, b, s) of cloud instance families.

Total 20+ configurations forming a discrete subset of the scaling plane.
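
The grid listed above is the Cartesian product of the two sets, which gives 5 × 4 = 20 points. A minimal enumeration follows; the (c, r, b, s) values attached to each instance type are placeholders, since the concrete instance families are not given here.

```python
from itertools import product

H_VALUES = (1, 2, 4, 8, 12)
V_TYPES = ("Small", "Medium", "Large", "XLarge")

# Placeholder resource vectors (vCPUs, GiB RAM, Gbps, thousands of IOPS).
RESOURCES = {
    "Small": (2, 8, 1, 3),
    "Medium": (4, 16, 2, 6),
    "Large": (8, 32, 5, 12),
    "XLarge": (16, 64, 10, 24),
}

CONFIGS = [(h, v, RESOURCES[v]) for h, v in product(H_VALUES, V_TYPES)]

print(len(CONFIGS))
for h, v, res in CONFIGS[:4]:
    print(h, v, res)
```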

Workloads

Read-intensive: 90% GET, 10% PUT (YCSB Workload B)
Write-intensive: 30% GET, 70% PUT (YCSB Workload A)
Mixed: Balanced GET/PUT ratio (Workload D)
Skewed: Zipfian distribution with skew parameter θ = 0.8
Dynamic: Time-varying request rates with sinusoidal, step, and burst patterns
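
The three dynamic patterns can be expressed as simple request-rate functions of time. The rates, periods, and durations below are assumed values for illustration, not the settings used in the experiments.

```python
import math


def sinusoidal(t: float, base: float = 1000.0, amp: float = 500.0,
               period: float = 600.0) -> float:
    """Smooth daily-cycle style load, in requests per second."""
    return base + amp * math.sin(2 * math.pi * t / period)


def step(t: float, low: float = 500.0, high: float = 2000.0,
         t_switch: float = 300.0) -> float:
    """Load jumps from low to high at t_switch and stays there."""
    return high if t >= t_switch else low


def burst(t: float, base: float = 500.0, peak: float = 5000.0,
          start: float = 120.0, duration: float = 30.0) -> float:
    """Short spike of length `duration` starting at `start`."""
    return peak if start <= t < start + duration else base


for t in (0, 130, 150, 300):
    print(t, round(sinusoidal(t), 1), step(t), burst(t))
```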

Evaluation Metrics

Latency: p50, p95, p99 latency
Throughput: ops/s
Cost: cost per unit time and cost per operation
Stability: number of autoscaling operations, rebalancing and leadership changes
SLA violation rate
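
Tail-latency and SLA metrics of this kind can be computed from raw samples as sketched below. The nearest-rank percentile definition and the sample data are assumptions; monitoring stacks often use interpolated or histogram-based estimates instead.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile for an integer p in 1..100."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def sla_violation_rate(samples: Sequence[float], l_max: float) -> float:
    """Fraction of requests whose latency exceeds L_max."""
    return sum(1 for x in samples if x > l_max) / len(samples)


latencies_ms = list(range(1, 101))  # synthetic samples: 1 ms .. 100 ms
print(percentile(latencies_ms, 50),
      percentile(latencies_ms, 95),
      percentile(latencies_ms, 99),
      sla_violation_rate(latencies_ms, l_max=90))
```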

Comparison Methods

Horizontal-only (H-only): Add/remove nodes based on CPU/memory only
Vertical-only (V-only): Change instance types based on resource saturation only
DiagonalScale (this work): Local search in H-V space with stability penalty

Implementation Details

Platform: Kubernetes cluster with HPA+VPA disabled
Controller: Custom autoscaling controller implementing DIAGONALSCALE
Monitoring: Prometheus + Grafana
Load generation: Locust/YCSB
Repetitions: All experiments repeated 5 times; error bars reflect standard deviation

Experimental Results

Main Results

1. Surface Structure Verification (Figures 2-3)

Synthetic latency surface L(H,V) (Figure 2) shows:

Horizontal lines at fixed V encounter increasing Lcoord
Vertical lines at fixed H face diminishing returns
Diagonal path reaches interior valley minimizing F

Cost-per-query heatmap (Figure 3) reveals:

Interior minimum reachable via diagonal scaling
Pure axis-aligned strategies miss optimal region

2. Autoscaling Trajectory Comparison (Figure 4)

Observations:

H-only: Oscillates, frequent node cycling and expensive rebalancing
V-only: Underreacts to load peaks, violates SLA constraints
DiagonalScale: Stabilizes quickly, uses fewer disruptive operations

3. Latency Under Dynamic Load (Figure 5)

Results:

H-only: Latency spikes during rebalancing
V-only: CPU and memory saturation
DiagonalScale: Avoids both issues, maintains lower and more stable tail latency

Specific numbers:

p95 latency reduction up to 40%
Significantly reduced latency variance

4. Cost-Benefit (Figure 6)

DiagonalScale reduces costs through:

Avoiding unnecessary node additions
Making small vertical adjustments
Minimizing expensive rebalancing

Cost-per-query reduction: up to 37%

5. Stability Metrics (Figure 7)

Rebalancing events and scaling operations:

DiagonalScale reduces disruptive changes by 2-5×
Fewer leadership changes
Smoother resource adjustments

6. SLA Violations

DiagonalScale reduces SLA violations through:

Smooth resource adjustments
Preventing CPU saturation
Avoiding network hotspots

7. Algorithm Efficiency

Each autoscaling decision takes < 5ms (due to closed-form evaluation).

Suitable for real-time control loops (1-5 second iterations).

Ablation Studies

While the paper does not explicitly list traditional ablation studies, implicit ablation is performed through comparison of three strategies (H-only, V-only, Diagonal):

Without diagonal movement (H-only + V-only): Significant performance degradation
Without stability penalty: Leads to more frequent oscillation (controlled by δ parameter)
Different neighborhood sizes: 8-neighbor configuration balances exploration and computational cost

Case Study

Scenario: Burst traffic pattern

H-only response: Immediately add 4 nodes → trigger large-scale rebalancing → latency spike → over-provisioning after traffic drops
V-only response: Upgrade to XLarge instance → CPU improves but network still saturated → partial SLA violations
DiagonalScale response: Add 1 node + upgrade to Large → balanced improvement → no rebalancing spike → more cost-effective

Experimental Findings

Diagonal paths usually optimal: In 80%+ of workload configurations, the optimal solution lies in the plane interior
Small vertical adjustments have large impact: Even single instance type upgrade significantly reduces required horizontal scaling
Stability-performance trade-off: Appropriate δ value (rebalancing penalty) crucial for avoiding oscillation
Workload-specific: Write-intensive workloads benefit more from diagonal scaling (due to coordination overhead)

Related Work

1. Horizontal Scaling in Distributed Databases

Representative systems:

Google Spanner: Paxos + TrueTime coordination
Bigtable: Range-based partitioning
Cassandra: Eventually consistent replication
CockroachDB: Raft consensus
DynamoDB: Hash partitioning

Limitations: Horizontal scaling increases coordination costs, sometimes superlinearly, causing p99 latency degradation.

2. Vertical Scaling

Representative systems:

Aurora Serverless v2: Supports fine-grained instance capacity adjustments
Kubernetes VPA: Adjusts pod sizes

Limitations:

Memory upgrades cannot resolve cross-partition skew
CPU upgrades cannot resolve metadata bottlenecks
Eventually hits hardware limits

3. Autoscaling in Cloud Systems

Existing approaches:

Kubernetes HPA: Adjusts replica count based on CPU or QPS
Cluster Autoscaler: Modifies cluster node count
Rule-based: Threshold-based policies like CPU > 70%

Shortcomings:

Do not model performance response surfaces across H and V
Ignore non-linear interactions between dimensions
Cannot compute diagonal trajectories

4. Unique Contributions of This Work

First to:

Construct multi-dimensional scaling plane
Derive cost/latency surfaces on (H,V)
Optimize diagonal scaling trajectories

Conclusions and Discussion

Main Conclusions

Diagonal scaling is necessary: Optimal configurations rarely lie on pure horizontal or vertical axes
Unified model is effective: Scaling plane provides geometric intuition for performance trade-offs
Significant practical performance gains: p95 latency reduced by up to 40%, cost-per-query by up to 37%, and rebalancing by 2-5×
Theory aligns with practice: Analytical surfaces predict actual system behavior

Limitations

Surface approximations: Real systems have more second-order effects (LSM tree compaction, garbage collection)
Model calibration: Requires sampling to fit parameters α, β, γ, δ, etc.
Local optimality: Algorithm finds local rather than global optimum
Discrete space: Discreteness of instance types limits fine-grained adjustments
Single-cluster assumption: Does not consider multi-region or federated deployments

Future Directions

Machine learning enhancement: Use ML to learn surface approximations in real-time
Three-dimensional scaling: Extend to decoupled compute, memory, storage architectures
Serverless applications: Apply diagonal scaling to serverless databases
Complex multi-objective optimization: Explore more sophisticated Pareto frontier exploration
Predictive scaling: Combine with workload prediction for proactive adjustments

Strengths

1. Innovation

Paradigm shift: Transition from one-dimensional to two-dimensional scaling decisions is fundamentally innovative
Solid theoretical foundation: Provides gradient alignment conditions and proof sketches for convergence and stability
Strong practical applicability: O(1) complexity suitable for real-time control

2. Experimental Sufficiency (★★★★☆)

Multi-system verification: CockroachDB (strong consistency) + Redis Cluster (eventual consistency)
Diverse workloads: Covers read/write/mixed/skewed/dynamic scenarios
Synthetic + practical: Both theoretical validation and practical evidence
Reproducibility: Detailed implementation details and parameter settings

3. Result Convincingness (★★★★★)

Significant improvements: Reductions of up to 40% in p95 latency and up to 37% in cost-per-query are substantial
Stability enhancement: 2-5× rebalancing reduction critical for production systems
Statistical rigor: 5-iteration experiments with error bars showing variance

4. Writing Clarity (★★★★☆)

Well-structured: Logic flows clearly from motivation → model → algorithm → evaluation
Effective visualization: Figures 2-7 intuitively present core concepts
Mathematical rigor: Formulas expressed precisely

Weaknesses

1. Model Simplification

Linear combination assumption: F = αL + βC + γK may be overly simplistic
Parameter sensitivity: Selection of weights α, β, γ lacks systematic methodology
Ignored second-order effects: Network congestion, disk contention

2. Experimental Limitations

Limited scale: Maximum 12 nodes; untested on large clusters (100+ nodes)
Homogeneous workloads: Primarily YCSB; lacks real application traces
Single cloud environment: Not tested across different cloud providers' pricing models

3. Theoretical Gaps

Global optimality: Only guarantees local optimum, no global guarantees
Convergence rate: Convergence speed not analyzed
Worst-case analysis: Lacks discussion of pathological workloads

4. Practical Considerations

Cold start problem: How to initialize parameters α, β, γ, δ?
Online learning: How to adjust model during runtime?
Failure handling: Behavior under node failures not discussed

Impact

1. Academic Contribution (High)

Opens new direction: Multi-dimensional scaling optimization may become new research area
Theoretical framework: Scaling plane model extensible by future work
Citation potential: Expected to be widely cited in database and cloud computing venues

2. Industrial Value (High)

Direct applicability: Can be integrated into AWS, GCP, Azure managed database services
Cost savings: A cost-per-query reduction of up to 37% has enormous economic value for large-scale deployments
Operational improvement: Rebalancing reduction highly attractive to operations teams

3. Reproducibility (Moderate)

Strengths: Clear algorithm description, low complexity
Challenges: Requires access to CockroachDB/Redis clusters; parameter calibration requires expertise

Applicable Scenarios

Ideal Scenarios

Cloud-native databases: Spanner, CockroachDB, YugabyteDB, etc.
Mixed workloads: Applications with varying read-write ratios
Cost-sensitive environments: Enterprises needing to optimize cloud spending
Dynamic loads: Systems with daily patterns or unpredictable peaks

Inapplicable Scenarios

Very small scale: Single-node or 2-3 node clusters (diagonal scaling benefits minimal)
Static workloads: Completely predictable and constant loads
Hard real-time systems: Cannot tolerate any scaling operation latency
Highly customized systems: Scaling behavior doesn't fit general model

Key References

[6] Spanner (OSDI'12): Google's globally distributed database with Paxos consensus
[7] Dynamo (SOSP'07): Amazon's highly available KV store
[3] Bigtable (TOCS'08): Google's distributed storage system
[4] CockroachDB: Open-source distributed SQL database
[5] YCSB (SoCC'10): Cloud serving systems benchmark framework
[8-10] Kubernetes Autoscaling: HPA, VPA, Cluster Autoscaler

Overall Assessment

Dimension	Score	Explanation
Innovation	9/10	Diagonal scaling is highly original concept
Technical Depth	8/10	Solid theoretical derivations, well-designed algorithm
Experimental Quality	8/10	Multi-system verification, but limited scale
Practical Value	9/10	Directly applicable to industrial systems
Writing Quality	8/10	Clear but some details could be improved
Overall	8.4/10	Excellent paper with significant potential impact

Recommended for: Cloud database researchers, distributed systems engineers, cloud platform architects, autoscaling system developers