The Study of Cooperative Behaviour in Network Evolutionary Games Based on Reinforcement Learning

DOI: 10.12677/orf.2024.143340, PP. 1073-1085

Keywords: Network Evolution Game, Cooperation, SARSA Algorithm, Strategy Selection Mechanism


Abstract:

Reinforcement learning, with its capacity for self-learning and online learning, is becoming an important tool for studying evolutionary games. This paper introduces the SARSA (State-Action-Reward-State-Action) algorithm into network games and proposes a SARSA-based evolutionary game model. Numerical simulations are run on four network topologies using three reinforcement learning decision-making mechanisms. The experiments show that introducing the algorithm significantly raises the level of cooperation among individuals in the network, and that this level remains stable within a bounded range. The paper also examines how the algorithm's parameter settings, the heterogeneity of the payoff matrices, and the global attributes of individuals affect cooperation in the network. The results show that cooperation among individuals is better promoted when the learning rate is low, the discount rate is high, and individual payoffs are moderate.
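For readers unfamiliar with SARSA, the standard on-policy update that gives the algorithm its name can be sketched as below. This is a minimal illustration only, not the model from the paper: the abstract does not specify the state definition, the payoff matrix, or the three decision-making mechanisms, so the state encoding (here, a hypothetical label such as the agent's previous action), the epsilon-greedy rule, and all function and parameter names are assumptions made for the example.

```python
import random

ACTIONS = ("C", "D")  # cooperate, defect


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one.

    Epsilon-greedy is an assumed selection rule used for illustration;
    the paper's own decision mechanisms are not described in the abstract.
    """
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma):
    """Apply the textbook SARSA rule in place and return the new value:

        Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a))

    alpha is the learning rate and gamma the discount rate.
    """
    td_target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q[(s, a)]


# Hypothetical usage: states are labelled by the agent's previous action.
q_table = {(s, a): 0.0 for s in ACTIONS for a in ACTIONS}
rng = random.Random(0)
action = epsilon_greedy(q_table, "C", epsilon=0.1, rng=rng)
```

In a network game, each node would hold its own table like `q_table`, play its neighbours, use the resulting payoff as the reward `r`, and then call `sarsa_update` with the action it actually selects next. That last point is what makes SARSA on-policy, in contrast to Q-learning, which bootstraps from the maximum over next actions.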

