Research on Slice Resource Allocation in Low Earth Orbit Satellite Radio Access Networks
Abstract:
As low Earth orbit (LEO) satellite networks grow in scale and their services diversify, allocating slice resources efficiently in a dynamic network environment has become a key open problem. This paper addresses slice resource allocation in the LEO satellite radio access network (RAN) and proposes a dynamic optimization strategy based on model-based reinforcement learning (MBRL). We formulate the problem as a constrained control problem with two requirements: the allocation policy must learn online in the operational network, and the service level agreement (SLA) violation rate must stay below a predefined threshold throughout learning. Building on this formulation, we propose an MBRL-based resource allocation strategy with two components: a kernel-method-based classifier and a model self-evaluation mechanism that controls the model's prediction error rate. Experimental results show that, compared with other reinforcement learning strategies, the proposed strategy performs better in resource utilization, service stability, and average execution time, and is better suited to the LEO satellite network environment.
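The abstract does not give the details of the model self-evaluation mechanism, so the following is only a minimal sketch of one way such a gate could work, under stated assumptions: the learned model's predictions are compared with observed outcomes over a sliding window, and the model's proposed allocation is used only while its windowed error rate stays at or below a threshold. The names `ModelSelfEvaluator`, `choose_allocation`, `window`, `max_error_rate`, and the fallback `safe_allocation` are all hypothetical and not taken from the paper.

```python
from collections import deque


class ModelSelfEvaluator:
    """Tracks a learned model's recent prediction error rate (hypothetical helper)."""

    def __init__(self, window: int = 50, max_error_rate: float = 0.1):
        # Each entry is True when the model's prediction was wrong.
        self.outcomes = deque(maxlen=window)
        self.max_error_rate = max_error_rate

    def record(self, predicted_ok: bool, actual_ok: bool) -> None:
        """Store whether the model mispredicted SLA compliance for one step."""
        self.outcomes.append(predicted_ok != actual_ok)

    def error_rate(self) -> float:
        if not self.outcomes:
            return 1.0  # assumption: with no evidence yet, treat the model as untrusted
        return sum(self.outcomes) / len(self.outcomes)

    def trusted(self) -> bool:
        return self.error_rate() <= self.max_error_rate


def choose_allocation(evaluator: ModelSelfEvaluator,
                      model_allocation: int,
                      safe_allocation: int) -> int:
    """Use the model's proposal only while the model is trusted.

    Otherwise fall back to a conservative (assumed over-provisioned) allocation.
    """
    return model_allocation if evaluator.trusted() else safe_allocation


# Illustrative usage with made-up observations: one misprediction in four steps.
evaluator = ModelSelfEvaluator(window=4, max_error_rate=0.25)
for predicted, actual in [(True, True), (True, True), (True, False), (True, True)]:
    evaluator.record(predicted, actual)

allocation = choose_allocation(evaluator, model_allocation=6, safe_allocation=10)
print(evaluator.error_rate(), allocation)
```

Falling back to a conservative allocation whenever the model is untrusted is one simple way to keep SLA violations bounded while learning online; the paper's actual mechanism may differ.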