Prioritizing Latency with Profit: A DRL-Based Admission Control for 5G Network Slices
Chakraborty, Asrar, Sengupta et al.
5G networks enable diverse services such as eMBB, URLLC, and mMTC through network slicing, necessitating intelligent admission control and resource allocation to meet stringent QoS requirements while maximizing Network Service Provider (NSP) profits. However, existing Deep Reinforcement Learning (DRL) frameworks focus primarily on profit optimization without explicitly accounting for service delay, potentially leading to QoS violations for latency-sensitive slices. Moreover, the epsilon-greedy exploration commonly used in DRL often results in unstable convergence and suboptimal policy learning. To address these gaps, we propose DePSAC -- a Delay and Profit-aware Slice Admission Control scheme. Our DRL-based approach incorporates a delay-aware reward function, where penalties due to service delay incentivize the prioritization of latency-critical slices such as URLLC. Additionally, we employ Boltzmann exploration to achieve smoother and faster convergence. We implement and evaluate DePSAC on a simulated 5G core network substrate with realistic Network Slice Request (NSLR) arrival patterns. Experimental results demonstrate that our method outperforms the DSARA baseline in overall profit, URLLC slice delay, acceptance rate, and resource consumption. These findings validate the effectiveness of the proposed DePSAC in achieving better QoS-profit trade-offs for practical 5G network slicing scenarios.
This paper proposes DePSAC (Delay and Profit-aware Slice Admission Control), a deep reinforcement learning-based solution for admission control in 5G network slicing. The scheme maximizes network service provider (NSP) profit while explicitly accounting for service latency, with particular emphasis on prioritizing ultra-reliable low-latency communication (URLLC) slices. The approach combines a delay-aware reward function with a Boltzmann exploration strategy. It is validated on a simulated 5G core network, where it improves on the baseline DSARA method in profit, latency, acceptance rate, and resource consumption.
5G networks support diverse services through network slicing technology, including enhanced mobile broadband (eMBB), ultra-reliable low-latency communication (URLLC), and massive machine-type communication (mMTC). These services have heterogeneous QoS requirements, necessitating intelligent admission control and resource allocation strategies to balance strict QoS demands with NSP profitability.
While the baseline DSARA method effectively maximizes profit, it does not account for latency differences across slice types, potentially causing QoS violations for latency-sensitive slices. This work aims to develop a slice admission control scheme that accounts for both latency and profit.
Delay-Aware Reward Function: Proposes a profit-delay-aware reward formula balancing QoS requirements and NSP profitability
Boltzmann Exploration Strategy: Integrates Boltzmann exploration into the DRL agent, improving learning stability and mitigating the unstable convergence and suboptimal policies often seen with epsilon-greedy exploration
Comprehensive Experimental Evaluation: Implements DePSAC on a simulated 5G core network using realistic network slice request arrival patterns
Performance Improvement Verification: Experimental results validate DePSAC's improvements in profit-QoS trade-offs, achieving shorter service latency, higher acceptance rates, and lower bandwidth utilization
Latency Penalty Mechanism: Introduces latency penalty terms in the reward function, incentivizing the agent to prioritize latency-sensitive slices
Smooth Exploration Strategy: Boltzmann exploration samples actions from a probability distribution derived from the Q-values (a softmax), avoiding purely random or purely greedy behavior
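The two mechanisms above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the summary does not give the exact reward formula, so the linear penalty on delay beyond a per-slice budget is an assumption, as are all names used here (delay_aware_reward, delay_budget, penalty_weight, temperature). The Boltzmann part is the standard softmax over Q-values, where a higher temperature gives more uniform exploration and a lower temperature approaches greedy selection.

```python
import math
import random


def delay_aware_reward(revenue, cost, delay, delay_budget, penalty_weight):
    """Hypothetical reward: profit minus a penalty on delay beyond the budget.

    The linear penalty form is an assumption for illustration only; a larger
    penalty_weight (e.g. for URLLC) makes excess delay more costly.
    """
    profit = revenue - cost
    excess_delay = max(0.0, delay - delay_budget)
    return profit - penalty_weight * excess_delay


def boltzmann_probabilities(q_values, temperature):
    """Softmax over Q-values: P(a) is proportional to exp(Q(a) / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum Q-value before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_action(q_values, temperature, rng=random):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Illustrative numbers only: a request that exceeds its delay budget.
    print(delay_aware_reward(revenue=10.0, cost=4.0, delay=5.0,
                             delay_budget=2.0, penalty_weight=1.5))
    # Hypothetical Q-values for two actions, e.g. reject and accept.
    print(boltzmann_probabilities([1.0, 2.0], temperature=0.5))
    print(boltzmann_action([1.0, 2.0], temperature=0.5))
```

In a sketch like this, the temperature would typically be annealed during training so that the agent explores broadly at first and becomes nearly greedy later; whether and how DePSAC schedules its temperature is not stated in this summary.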
Overall Profit: DePSAC shows slightly lower profit than DSARA during early training due to exploration, but consistently outperforms the baseline as training progresses
Categorical Profit: Profit improves across all service types (eMBB, URLLC, mMTC), with URLLC showing the most significant gains
The paper validates the effectiveness of the delay-aware reward function and Boltzmann exploration through comparison with DSARA, though it does not provide a detailed component-level ablation analysis.
Compared to existing DRL-based work, which focuses primarily on profit optimization, this paper explicitly considers both latency and profit within a single DRL framework and employs a more stable exploration strategy.
The paper cites 12 relevant references covering key areas including 5G network slicing, deep reinforcement learning, and resource allocation, providing sufficient theoretical foundation and comparison benchmarks.
Overall Assessment: This paper addresses the latency-profit trade-off problem in 5G network slice admission control with an innovative and practical solution. The method design is sound, experimental validation is comprehensive, and the work demonstrates good academic value and application prospects in this field. Main areas for improvement include theoretical analysis and practical deployment considerations.