
A protocol to reduce worst-case latency in deflection-based on-chip networks

Indrusiak
We present a novel protocol that reduces worst-case packet latency in deflection-based on-chip interconnect networks. It enforces the deflection of the header of a packet but not its payload, resulting in a reduction in overall network traffic and, more importantly, worst-case packet latency due to decreased pre-injection latency.

Basic Information

  • Paper ID: 2510.11361
  • Title: A protocol to reduce worst-case latency in deflection-based on-chip networks
  • Author: Leandro Soares Indrusiak (University of Leeds)
  • Classification: cs.NI (Networking and Internet Architecture), cs.PF (Performance)
  • Publication Date: October 13, 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2510.11361

Abstract

This paper proposes a novel protocol to reduce worst-case packet latency in deflection-based on-chip interconnection networks. The protocol deflects only the header of deflected packets rather than their payload, thereby reducing overall network traffic and, more importantly, lowering worst-case packet latency by reducing pre-injection delay.

Research Background and Motivation

Problem Definition

  1. Core Problem: In deflection-based networks-on-chip (NoCs), conventional full-packet deflection inflates packet latency, which makes it hard to meet the performance guarantees that embedded real-time systems require.
  2. Problem Significance:
    • Deflection-based networks are resource-efficient (85% less chip area, 10× lower power consumption), but deflections significantly increase network latency
    • Embedded multiprocessor platforms require performance guarantees, with increased latency directly affecting end-to-end performance
    • Existing analytical models demonstrate that deflection substantially increases worst-case latency
  3. Limitations of Existing Approaches:
    • Traditional deflection routing sends entire packets (including headers and payloads) along alternative paths during congestion
    • This approach increases unnecessary network traffic, especially when packets need to return to injection points for retry
    • Existing optimization methods primarily focus on changing routing or reducing deflection frequency, without considering traffic optimization during deflection
  4. Research Motivation:
    • Observation that deflected packets always pass through their injection switch before reaching the destination again
    • Proposal to deflect only headers while discarding payloads, with payload re-injection at the injection point
    • Goal to reduce network interference and improve overall worst-case latency

Core Contributions

  1. Novel Deflection Protocol: Innovatively separates packet headers and payloads, deflecting only headers while discarding payloads
  2. Theoretical Analysis Framework: Modifies existing worst-case latency analysis framework to quantify performance improvements of the new protocol
  3. Performance Guarantees: Theoretically proves the new protocol outperforms traditional methods in worst-case latency
  4. Comprehensive Experimental Validation: Demonstrates protocol effectiveness across different scenarios through application-specific and large-scale synthetic evaluations

Methodology Details

Task Definition

Input: Router-less on-chip networks with ring topology employing full-packet deflection routing
Output: Improved deflection protocol reducing worst-case packet latency
Constraints: Maintain original network functionality, no additional buffering overhead, satisfy real-time system performance requirements

Protocol Architecture

Traditional Deflection Mechanism

In traditional protocols, when a packet cannot be ejected at the destination switch (e.g., ejection link is occupied), the entire packet is deflected and transmitted along the ring until reaching the destination again.

New Protocol Design

  1. Header Deflection: Only packet headers continue transmission in the ring during deflection
  2. Payload Discard: Packet payloads are completely discarded at the deflection point
  3. Payload Re-injection: When modified headers return to the injection switch, payload re-injection is triggered
  4. Header Modification: The destination switch modifies header fields during deflection to identify re-injection requirements
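
The four steps can be traced for a single packet as follows. This is a minimal Python sketch, not the paper's implementation: the function name `trace_delivery`, the event labels, and the idea that the caller states in advance how many ejection attempts are blocked are all assumptions introduced here for illustration.

```python
def trace_delivery(src, dst, blocked_attempts):
    """List the protocol events for one packet on a ring.

    src, dst: injection and destination switch ids.
    blocked_attempts: how many times ejection at dst fails (hypothetical input).
    """
    events = [("inject", src, "header+payload")]
    for _ in range(blocked_attempts):
        # Steps 1, 2 and 4: dst cannot eject, so it drops the payload and
        # forwards only the header, marked as needing re-injection.
        events.append(("deflect", dst, "header only, marked"))
        # Step 3: the marked header reaches the injection switch, which
        # sends the packet again from its injection buffer.
        events.append(("reinject", src, "header+payload"))
    events.append(("eject", dst, "header+payload"))
    return events


for event in trace_delivery(src=0, dst=3, blocked_attempts=1):
    print(event)
```

Each blocked ejection adds exactly one header-only trip back to the injection switch and one fresh transmission of the full packet.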

Key Technical Details

Buffer Management:

  • Packet payloads cannot be immediately deleted from injection buffers after injection
  • Employs SAFC or SAMQ buffering techniques to manage re-injection requirements
  • Under deadline constraints (Di ≤ Ti), no additional buffer memory is required
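
One way to read the buffering requirement is sketched below in Python. It is an illustration under stated assumptions, not the paper's design: the class `InjectionBuffer` and its methods are hypothetical, SAFC and SAMQ organisation is not modelled, and the one-slot-per-flow rule is an interpretation of the Di ≤ Ti condition (a packet is no longer needed once the next packet of the same flow is released).

```python
class InjectionBuffer:
    """Keeps the latest packet of each flow so it can be sent again.

    Assumption (not from the paper): one slot per flow is enough because,
    with deadline <= period, a packet is no longer needed once the next
    packet of the same flow is released.
    """

    def __init__(self):
        self._slots = {}  # flow_id -> payload of the most recent release

    def release(self, flow_id, payload):
        """Store a newly released packet, replacing the flow's previous one."""
        self._slots[flow_id] = payload
        return payload

    def reinject(self, flow_id):
        """Return the retained payload when a deflected header comes back."""
        return self._slots[flow_id]


buf = InjectionBuffer()
buf.release(flow_id=7, payload=b"frame-0")
assert buf.reinject(7) == b"frame-0"   # payload still available after injection
buf.release(flow_id=7, payload=b"frame-1")
assert buf.reinject(7) == b"frame-1"   # slot reused at the next release
```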

Header Identification Mechanism:

  • Injection switches must identify deflected headers
  • Headers contain destination switch and unique identifiers for packet flows
  • Re-injection triggering is implemented through field modification
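
The summary does not give the header encoding, so the Python sketch below uses a purely hypothetical layout to show how a single flag bit could carry the re-injection request. The field widths, the constant `REINJECT_BIT`, and all function names are assumptions; the only constraint taken from the text is that the header holds a destination switch and a flow identifier.

```python
DST_BITS = 8     # hypothetical width of the destination switch id
FLOW_BITS = 16   # hypothetical width of the flow identifier
REINJECT_BIT = 1 << (DST_BITS + FLOW_BITS)  # hypothetical re-injection flag


def pack_header(dst, flow_id, reinject=False):
    """Pack the fields into one integer header word."""
    assert 0 <= dst < (1 << DST_BITS) and 0 <= flow_id < (1 << FLOW_BITS)
    word = (dst << FLOW_BITS) | flow_id
    return word | REINJECT_BIT if reinject else word


def unpack_header(word):
    """Return (dst, flow_id, reinject) from a header word."""
    flow_id = word & ((1 << FLOW_BITS) - 1)
    dst = (word >> FLOW_BITS) & ((1 << DST_BITS) - 1)
    return dst, flow_id, bool(word & REINJECT_BIT)


def mark_deflected(word):
    """Destination switch: set the flag before forwarding the header alone."""
    return word | REINJECT_BIT


def must_reinject(word, local_flow_ids):
    """Injection switch: act only on flagged headers of flows it injected."""
    _, flow_id, reinject = unpack_header(word)
    return reinject and flow_id in local_flow_ids
```

With these widths the header uses 25 bits, so it fits in a single 32-bit word.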

Traffic Optimization:

Traditional approach: Full packet deflection = Header(H) + Payload(L-H)
New protocol: Header-only deflection = Header(H)
Traffic reduction = (L-H) × deflection count × return path length
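
The traffic reduction can be checked with a small worked example. The Python sketch below assumes a unidirectional ring in which a deflected full packet completes one lap per deflection, and counts traffic in flit-hops (flits times links traversed); the function name and parameters are hypothetical.

```python
def flit_hops(length, header, forward_hops, ring_size, deflections, header_only):
    """Flit-hops (flits x links traversed) needed to deliver one packet."""
    return_hops = ring_size - forward_hops      # destination back to injection switch
    total = length * forward_hops               # first attempt
    for _ in range(deflections):
        if header_only:
            # the header returns alone, then the whole packet is sent again
            total += header * return_hops + length * forward_hops
        else:
            # the whole packet does a full lap of the ring
            total += length * ring_size
    return total


full = flit_hops(16, 1, 3, 8, 1, header_only=False)
new = flit_hops(16, 1, 3, 8, 1, header_only=True)
print(full, new, full - new)  # 176 101 75
```

For L = 16, H = 1, one deflection and a return path of 5 hops, the saving is (16 - 1) × 1 × 5 = 75 flit-hops, in line with the expression above.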

Technical Innovations

  1. Separation-based Deflection Strategy:
    • Breaks away from traditional full-packet deflection paradigm
    • Leverages ring network topology characteristics (deflected packets must pass through injection point)
    • Enables in-place payload reuse
  2. Interference Reduction Mechanism:
    • Does not directly reduce deflection frequency for individual packets
    • Reduces interference on other packets through decreased network traffic
    • Focuses on optimizing pre-injection delay (Ipre)
  3. Backward-compatible Design:
    • Compatible with existing deflection reduction techniques
    • Maintains consistency with original network behavior
    • Supports end-to-end acknowledgment mechanism extensions

Experimental Setup

Datasets

  1. Application-specific Evaluation:
    • Uses 39-flow autonomous vehicle (AV) benchmark
    • Configuration: VGA resolution camera, 8-bit color, 25fps
    • Generates 100 random mappings to avoid bias
  2. Large-scale Synthetic Evaluation:
    • Each benchmark contains 100 randomly generated flow sets
    • Flow counts ranging from 20 to 280
    • Parameter ranges: periods 1-100 microseconds, jitter 0-50% of period, packet sizes 16-48 or 32-96 flits
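
A synthetic flow set with these parameter ranges could be generated along the following lines. This Python sketch is not the paper's generator: the uniform distributions, the `Flow` fields, and the rule that source and destination differ are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    src: int
    dst: int
    period_us: float
    jitter_us: float
    length_flits: int


def random_flow_set(n_flows, n_cores, size_range=(16, 48), seed=None):
    """Draw n_flows flows with parameters inside the ranges listed above."""
    rng = random.Random(seed)
    flows = []
    for _ in range(n_flows):
        src, dst = rng.sample(range(n_cores), 2)   # two distinct cores
        period = rng.uniform(1.0, 100.0)           # 1-100 microseconds
        jitter = rng.uniform(0.0, 0.5 * period)    # 0-50% of the period
        length = rng.randint(*size_range)          # inclusive flit range
        flows.append(Flow(src, dst, period, jitter, length))
    return flows


# One benchmark point: 100 flow sets of 20 flows on a 4x4 (16-core) network.
benchmark = [random_flow_set(20, 16, seed=s) for s in range(100)]
```

Passing `size_range=(32, 96)` gives the larger-packet configuration.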

Network Configuration

  • Network Topology: 4×4, 5×5, 6×6, 7×7, 8×8, 9×9 core networks
  • Technical Parameters: 32-bit flits, 1-flit headers, 1GHz clock frequency
  • Deflection Settings: Maximum 0-3 deflections, using oldest-first livelock prevention mechanism

Evaluation Metrics

  1. Worst-case Latency Reduction Percentage: Latency improvement of new protocol relative to baseline
  2. Schedulability Ratio: Percentage of fully schedulable cases in benchmark sets
  3. Pooled Average Improvement: Average improvement across all flows and mappings
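
The three metrics can be written down directly. The Python sketch below follows the definitions above; the function names and the input layout (lists of latency and deadline pairs, lists of per-mapping reductions) are assumptions for illustration.

```python
def latency_reduction_pct(baseline_latency, new_latency):
    """Worst-case latency reduction of the new protocol relative to the baseline."""
    return 100.0 * (baseline_latency - new_latency) / baseline_latency


def schedulability_ratio(flow_sets):
    """Percentage of flow sets in which every flow meets its deadline.

    flow_sets: list of flow sets, each a list of (worst_case_latency, deadline).
    """
    schedulable = sum(all(lat <= dl for lat, dl in fs) for fs in flow_sets)
    return 100.0 * schedulable / len(flow_sets)


def pooled_average(reductions_per_mapping):
    """Average over all flows of all mappings, pooled into one sample."""
    pooled = [r for mapping in reductions_per_mapping for r in mapping]
    return sum(pooled) / len(pooled)


print(latency_reduction_pct(200, 150))                         # 25.0
print(schedulability_ratio([[(5, 10), (8, 8)], [(12, 10)]]))   # 50.0
print(pooled_average([[10.0, 0.0], [20.0]]))                   # 10.0
```

Pooling weights every flow equally, so a mapping with more flows contributes more to the average than it would in a mean of per-mapping means.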

Comparison Methods

  • Baseline Method: Traditional router-less network protocol proposed by Alazemi et al.
  • Analysis Method: Worst-case latency analysis model by Indrusiak and Burns

Experimental Results

Main Results

Application-specific Evaluation Results

Network Size              4×4     5×5     6×6     7×7     8×8     9×9
Maximum Improvement (%)   93.07   89.45   89.26   89.33   83.36   80.66
Pooled Average (%)        6.60    3.33    3.20    2.64    2.16    0.92

Key Findings:

  • All network topologies benefit from the new protocol
  • Maximum improvements reach 93%, primarily in flows with severe pre-injection interference
  • Average improvement gradually decreases with network size (more uniform traffic distribution)

Large-scale Synthetic Evaluation Results

The new protocol significantly outperforms the baseline in schedulability ratio:

  • Single Deflection Scenario: Schedulability improvement exceeds 20%
  • Multiple Deflection Scenario: Best performance under moderate loads
  • Network Scale Impact: Smaller networks (4×4) show more pronounced improvements
  • Packet Size Impact: Larger packets (32-96 flits) drive networks toward saturation

Ablation Studies

Comparative analysis with different deflection counts (0-3):

  • 0 Deflections: New protocol identical to baseline (correctness verification)
  • 1 Deflection: New protocol advantages most pronounced
  • Multiple Deflections: Improvement degree decreases with increasing deflection count

Theoretical Analysis Verification

Modified pre-injection idle time analysis formula:

The original formula uses the full packet length Lj of every interfering flow.
The new protocol's formula replaces the packet length Lj of deflected flows with the header length H.
Since H < Lj, the bound under the new protocol is never larger than the original one.
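
The comparison can be read as a monotonicity argument. The LaTeX block below is a generic sketch and not the paper's actual formula: it assumes the pre-injection bound can be written as a function I that is non-decreasing in each interfering length, and the symbols I, \ell_j^{new} and the set \mathcal{D} of deflected flows are introduced here for illustration only.

```latex
\[
\ell_j^{\text{new}} =
\begin{cases}
H   & \text{if } j \in \mathcal{D},\\
L_j & \text{otherwise,}
\end{cases}
\qquad
H \le L_j \;\Rightarrow\; \ell_j^{\text{new}} \le L_j \ \text{ for all } j
\]
\[
I\bigl(\ell_1^{\text{new}},\dots,\ell_n^{\text{new}}\bigr)
\;\le\;
I\bigl(L_1,\dots,L_n\bigr)
\]
```

When no flow is deflected, \mathcal{D} is empty and both sides coincide, which matches the 0-deflection case in which the two protocols are identical.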

Experimental Findings

  1. Optimal Improvement Conditions: Scenarios with severe pre-injection interference and networks not fully saturated
  2. Scale Effects: The RLrec algorithm generates more small loops in large networks, which limits the room for improvement
  3. Load Sensitivity: Improvement effects correlate positively with packet size and deflection frequency

Deflection Routing Research Directions

  1. Flit-level Deflection: Each flit makes independent deflection decisions, requiring reordering mechanisms
  2. Packet-level Deflection: Entire packets deflect uniformly, preserving flit order
  3. Hybrid Methods: Strategies combining buffering and deflection

Paper Positioning

  • Technical Route: Selects packet-level deflection for resource efficiency advantages
  • Innovation Angle: First to propose header-payload separation deflection mechanism
  • Analysis Contribution: Extends existing worst-case latency analysis framework

Related Optimization Approaches

  • Routing Optimization: Methods like DARES that change routing paths
  • Hardware Optimization: Buffer design and arbitration strategy improvements
  • Topology Optimization: Ring configuration and switch design optimization

Conclusions and Discussion

Main Conclusions

  1. Theoretical Advantages: The new protocol's worst-case latency bound is never worse than that of the full-packet deflection baseline
  2. Practical Effectiveness: Achieves significant latency reduction and schedulability improvements across multiple scenarios
  3. Implementation Feasibility: Implementable with existing buffering techniques (SAFC or SAMQ), with no additional buffer memory when Di ≤ Ti
  4. Application Value: Particularly suitable for hard real-time systems' performance guarantee requirements

Limitations

  1. Topology Constraints: Primarily applicable to ring network topologies
  2. Improvement Attenuation: Limited improvement effects in large networks or high-load scenarios
  3. Implementation Complexity: Requires modifications to injection buffer management and header identification mechanisms
  4. Evaluation Scope: Does not quantify average-case latency and energy consumption improvements

Future Directions

  1. Topology Extension: Explore applicability to other network topologies
  2. Performance Quantification: Evaluate average latency and energy consumption improvements
  3. Hardware Implementation: Develop concrete hardware implementation schemes and prototype validation
  4. Protocol Optimization: Combined optimization with other deflection reduction techniques

In-depth Evaluation

Strengths

  1. Strong Innovation: The header-payload separation concept is original and may inspire further work
  2. Theoretical Rigor: Provides complete mathematical analysis framework and theoretical proofs
  3. Comprehensive Experiments: Encompasses both application-specific and large-scale synthetic evaluation methods
  4. High Practical Value: Addresses critical performance issues in real-time systems
  5. Clear Writing: Accurate technical descriptions and logical structure

Weaknesses

  1. Limited Application Scope: Primarily targets ring networks; applicability to other topologies unclear
  2. Insufficient Implementation Details: Specific encoding methods for header modification and hardware implementation details lacking
  3. Limited Baseline Comparisons: Primarily compares with one baseline method; lacks comparison with other optimization techniques
  4. Single Evaluation Metric: Focuses on worst-case latency; insufficient analysis of average performance and energy consumption impacts

Impact

  1. Academic Contribution: Provides new research direction for deflection-based network optimization
  2. Practical Value: Directly applicable to NoC design in embedded real-time systems
  3. Reproducibility: Detailed analysis models and experimental settings facilitate reproduction and extension
  4. Inspirational Significance: Separation-based approach may inspire other network optimization research

Applicable Scenarios

  1. Hard Real-time Systems: Embedded applications requiring strict latency guarantees
  2. Resource-constrained Environments: NoC designs sensitive to area and power consumption
  3. Ring Network Architectures: NoC systems employing ring topologies
  4. Medium-scale Networks: 4×4 to 6×6 networks achieve optimal improvement effects

References

This paper cites 15 related studies, primarily including:

  • [1] Alazemi et al.'s router-less network architecture
  • [6] Indrusiak and Burns' worst-case latency analysis
  • [8] Liu et al.'s IMR ring network design
  • Other related work on deflection routing, real-time analysis, and NoC optimization

Overall Assessment: This is a high-quality systems architecture paper proposing an innovative deflection routing optimization protocol with solid theoretical foundations and comprehensive experimental validation. While having certain limitations in application scope and implementation details, its core ideas possess significant academic value and practical importance, providing new directions for on-chip network optimization research.