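Journal of Highway and Transportation Research and Development (公路交通科技), 2014

Online Q-Learning Model for a Single Intersection Minimizing the Average Queue Length Difference

pp. 116-122

LU Shou-feng (卢守峰), ZHANG Shu (张术), LIU Xi-min (刘喜敏)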

Keywords: traffic engineering, online Q-learning, signal timing optimization, queue length


Abstract:

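To improve queue length management at intersections and keep the queue in any single direction from growing too long, an online Q-learning model that takes minimization of the average queue length difference as its optimization objective was built on reinforcement learning theory. Because control performance indices are insensitive to differences between neighboring timing plans, the reward function was reconstructed with the average queue length difference as its basic unit, with the aim of widening the gaps between the Q-values of the candidate actions and improving the model's convergence speed and robustness. An online simulation platform integrating Excel VBA, Vissim, and Matlab was built and used as the computing environment for a case study. In the case study, GPS data were used to calibrate the vehicle acceleration and deceleration curves in Vissim. The results show that using the average queue length difference as the optimization objective improves the balance of queue lengths across directions and makes better use of the intersection's time and space resources. The online Q-learning model shows learning ability and fast computation; whether it converges depends on the cycle length and the number of available actions.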
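The idea in the abstract can be illustrated with a minimal tabular Q-learning sketch. It is not the authors' model: the abstract does not give the state definition, the action set, the exact reward formula, or the learning parameters, so everything named below (`ACTIONS`, `ARRIVALS`, `SAT_FLOW`, `state_of`, `step`, the two-phase toy junction that stands in for Vissim) is an assumption made for illustration. The reward is simply the negative absolute difference between the two queue lengths, and the end-of-cycle queue is used as a rough proxy for the average queue.

```python
import random

# Hypothetical setup: none of these values come from the paper.
CYCLE = 60                 # assumed fixed cycle length (s)
ACTIONS = [20, 30, 40]     # assumed green times (s) for the north-south phase
ARRIVALS = (0.30, 0.15)    # assumed arrival rates (veh/s) for NS and EW
SAT_FLOW = 0.5             # assumed discharge rate (veh/s of green)
N_STATES = 5               # assumed number of discretized states


def reward(queue_ns, queue_ew):
    """Negative absolute queue length difference (assumed reward form)."""
    return -abs(queue_ns - queue_ew)


def state_of(queue_ns, queue_ew, width=5.0):
    """Discretize the signed queue difference into N_STATES bins."""
    idx = int((queue_ns - queue_ew) // width) + N_STATES // 2
    return max(0, min(N_STATES - 1, idx))


def step(queues, green_ns, rng):
    """Toy stand-in for the simulator: advance both queues by one cycle."""
    greens = (green_ns, CYCLE - green_ns)
    updated = []
    for queue, rate, green in zip(queues, ARRIVALS, greens):
        arrived = rate * CYCLE * rng.uniform(0.8, 1.2)
        served = SAT_FLOW * green
        updated.append(max(0.0, queue + arrived - served))
    return tuple(updated)


def train(cycles=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run online Q-learning, updating the table once per signal cycle."""
    rng = random.Random(seed)
    q_table = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    queues = (0.0, 0.0)
    s = state_of(*queues)
    for _ in range(cycles):
        # Epsilon-greedy action selection over the candidate green times.
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q_table[s][i])
        queues = step(queues, ACTIONS[a], rng)
        r = reward(*queues)
        s_next = state_of(*queues)
        # Standard one-step Q-learning update.
        target = r + gamma * max(q_table[s_next])
        q_table[s][a] += alpha * (target - q_table[s][a])
        s = s_next
    return q_table
```

Calling `train()` returns a table with one row per discretized state and one column per candidate green time. In this toy setting a larger queue imbalance gives a more negative reward, which is one simple way to separate the Q-values of neighboring timing plans; the paper's actual reward construction may differ.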
