%0 Journal Article
%T 基于强化学习的低轨卫星网络动态分布式路由算法研究<br>Research on Dynamic Distributed Routing Algorithm for Low Earth Orbit Satellite Networks Based on Reinforcement Learning
%A 訾鑫源
%A 刘健培
%A 邝坚
%J Computer Science and Application
%P 592-605
%@ 2161-881X
%D 2025
%I Hans Publishing
%R 10.12677/csa.2025.155132
%X 随着全球通信需求的快速增长&#65292;低轨卫星网络凭借其覆盖范围广、灵活性高的优势成为地面通信的有力补充。然而&#65292;卫星的高速运动导致网络拓扑和链路状态动态变化&#65292;为路由算法设计带来巨大挑战。为应对上述问题&#65292;本文提出一种基于强化学习的动态分布式路由算法。首先系统性建模卫星网络通信过程&#65292;涵盖网络拓扑、通信链路、通信时延及丢包等要素&#65307;其次&#65292;提出动态分布式路由架构&#65292;引入星间状态交换机制&#65292;使卫星节点能够实时感知网络状态变化并自主决策&#65307;然后&#65292;将路由决策问题建模为分布式部分可观察马尔可夫决策过程(Dec-POMDP)&#65292;提出结合Double DQN和Dueling DQN的MAD3QN路由算法&#65292;构建高效状态表示和奖励函数&#65292;有效引导智能体优化路由决策。仿真结果表明&#65292;与其他路由算法相比&#65292;MAD3QN算法在端到端时延、丢包率、吞吐量等性能指标上均表现更优&#65292;充分证明了该算法对低轨卫星网络高动态环境的适应性与有效性。<br>With the rapid growth of global communications demands, low Earth orbit (LEO) satellite networks have become a powerful supplement to ground communications due to their wide coverage and high flexibility. However, the high-speed movement of satellites leads to dynamic changes in network topology and link status, posing significant challenges for routing algorithm design. To address these issues, this paper proposes a dynamic distributed routing algorithm based on reinforcement learning. First, we systematically model the satellite network communication process, covering factors such as network topology, communication links, communication delay, and packet loss. Second, we introduce a dynamic distributed routing architecture and a mechanism for inter-satellite state exchange, enabling satellite nodes to perceive network state changes in real time and make autonomous decisions. Next, we model the routing decision problem as a Distributed Partially Observable Markov Decision Process (Dec-POMDP) and propose the MAD3QN routing algorithm, which combines Double DQN and Dueling DQN. The algorithm constructs an efficient state representation and reward function to effectively guide the agent in optimizing routing decisions. Simulation results show that compared to other routing algorithms, the MAD3QN algorithm outperforms in performance metrics such as end-to-end delay, packet loss rate, and throughput, demonstrating its adaptability and effectiveness in the highly dynamic environment of LEO satellite networks.
%K 低轨卫星网络&#65292
%K 分布式路由&#65292
%K 强化学习<br>Low Earth Orbit Satellite Network
%K Distributed Routing
%K Reinforcement Learning
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=114900