A Distributed Resource Management Algorithm for the Internet of Vehicles Based on LSTM Dueling Double DQN
Abstract:
The World Health Organization (WHO) points out that Vehicle-to-Everything (V2X) communication between vehicles and other vehicles, infrastructure, or pedestrians is essential for preventing the traffic accidents that cause millions of deaths every year. V2X enables vehicles to broadcast safety information, including their position, speed, and collision warnings, so as to reduce road accidents. With the progress of wireless communication technology and the development of intelligent driving, real-time delivery, low network cost, and reliable delivery of communication information have become the foundation and core of the rapid development of intelligent transportation systems. However, in V2X communication, data transmission usually relies on a base station to allocate frequency and time resources to vehicles, so the stability of transmission cannot always be guaranteed; moreover, the characteristics of the wireless channel and the mobility of vehicles pose great challenges to Internet of Vehicles communication services. In large-scale temporal data distribution scenarios, a traditional distributed service framework cannot rely on base stations alone to satisfy heterogeneous network resource allocation and communication resource sharing under heavy traffic, because base-station coverage is limited and vehicles move at high speed; service requests therefore have to be completed cooperatively by roadside units and vehicles. To address these problems, this paper systematically studies Internet of Vehicles information service technology from the perspectives of service framework and algorithm design, focusing on large-scale data distribution in the V2X communication environment. First, a multi-agent temporal information service framework for distributed resource scheduling is proposed and a distributed resource allocation model is established; on this basis, a distributed resource allocation algorithm based on multi-agent reinforcement learning is proposed. Through real-time interaction between the vehicles and the network, the algorithm makes intelligent decisions on the reservation and reuse of wireless resources in a dynamic environment, so that resource selection adapts to the dynamic changes in a vehicle's surroundings and the packet collision probability is significantly reduced.
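To make the architecture named in the title concrete, the following is a minimal sketch, in Python with PyTorch, of the kind of per-vehicle agent such an algorithm could use: an LSTM that summarizes a short history of local observations, a dueling head that splits the Q-value into a state value and per-action advantages, and a Double DQN target in which the online network selects the action and the target network evaluates it. The class and parameter names (LSTMDuelingQNet, obs_dim, n_actions, hidden_dim) and the observation/action semantics are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch (assumed, not the paper's implementation): an LSTM
# Dueling Double DQN agent that each vehicle could run to choose a sidelink
# resource block. Observation size, action count and hidden sizes are made up.
import torch
import torch.nn as nn


class LSTMDuelingQNet(nn.Module):
    """Recurrent Q-network: an LSTM over recent local observations,
    followed by a dueling head Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""

    def __init__(self, obs_dim: int, n_actions: int, hidden_dim: int = 128):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.value_head = nn.Linear(hidden_dim, 1)                # V(s)
        self.advantage_head = nn.Linear(hidden_dim, n_actions)    # A(s,a)

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim) -- e.g. recent channel sensing
        # results and queue state observed by one vehicle agent.
        out, hidden = self.lstm(obs_seq, hidden)
        h = out[:, -1]                        # last LSTM output
        v = self.value_head(h)                # (batch, 1)
        a = self.advantage_head(h)            # (batch, n_actions)
        q = v + a - a.mean(dim=1, keepdim=True)
        return q, hidden


def double_dqn_target(online, target, next_obs_seq, reward, done, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it, which reduces value overestimation."""
    with torch.no_grad():
        q_online, _ = online(next_obs_seq)
        best_a = q_online.argmax(dim=1, keepdim=True)
        q_target, _ = target(next_obs_seq)
        q_next = q_target.gather(1, best_a).squeeze(1)
        return reward + gamma * (1.0 - done) * q_next
```

In a design of this kind, the LSTM lets each agent cope with the partially observed, fast-changing vehicular channel by conditioning on a short history of sensing results rather than a single snapshot, while the dueling decomposition and the Double DQN target are standard remedies for value overestimation when many similar resource-selection actions are available.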