{4}Cronje W B. Optimization model for isolated signalized traffic intersections. Transportation Research Record, 1983, 905: 80-83.
{5}Elahi S M, Radwan A E, and Goul K M. Knowledge-based system for adaptive traffic signal control. Transportation Research Record, 1991, 1324: 115-122.
{9}Zhao H. Model-free adaptive control of green time at a single signalized intersection. Master's thesis, Beijing Jiaotong University, 2008. (ZHAO H. Single Intersection Timing Plan Based on Model Free Adaptive Control[D], Beijing: Beijing Jiaotong University, 2010: 33-41)
{12}Xu J, Yu W S, and Wang F Y. Ramp metering based on adaptive critic designs[C]. Proceedings of the 9th International IEEE Conference on Intelligent Transportation Systems, Toronto, Canada, 2006: 1531-1536.
{15}Seong C Y, Widrow B. Neural Dynamic Optimization for Control Systems. IEEE Transactions on Systems, Man, and Cybernetics, 2001, 31(4): 482-489.
{16}Werbos P J. Consistency of HDP applied to a simple reinforcement learning problem[J]. Neural Networks, 1990, 3(22):179-189.
{17}Saeks R E, Cox C J, Neidhoefer J C, et al. Neural adaptive control of LoFLYTE[C]. Proceedings of American Control Conference, 2001, 2913-2917.
{20}Watkins C C, Dayan P. Q-learning[J]. Machine Learning, 1992, 8: 279-292.
{21}Werbos P J. Backpropagation through time: What it does and how to do it[J]. Proceedings of the IEEE, 1990, 78(10): 1550-1560.
{1}Webster F V. Traffic signal settings. Road Research Laboratory, London, U.K., Road Research Tec, 1958, 39.
{2}May A D. Traffic flow theory - the traffic engineer's challenge. Proceedings of the Institute of Transportation Engineers, 1965, 290-303.
{3}Allsop R E. Delay at a fixed time traffic signal I: theoretical analysis. Transportation Science, 1972, 6(3): 260-285.
{6}Michalopoulos P G, Stephanopoulos G. Optimum control of oversaturated intersections: Theoretical and practical consideration. Traffic Engineering and Control, 1978, 19(5): 216-221.
{7}Chang T H, Lin J Y. Optimal signal timing for an oversaturated intersection. Transportation Research Part B, 2000, 34: 471-491.
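{8}Xu J. Traffic control technology based on neural network optimization. Doctoral dissertation, Institute of Automation, Chinese Academy of Sciences, 2007. (XU J. Traffic Control Based on Neural Optimization[D], Beijing: Institute of Automation, Chinese Academy of Sciences, 2007: 55-56)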
{10}Jia Y. Research on traffic control algorithms based on approximate dynamic programming. Master's thesis, Beijing Jiaotong University, 2007. (JIA Y. Traffic control algorithm based on approximate dynamic programming[D], Beijing: Beijing Jiaotong University, 2008: 38-40)
{11}Bai X R, Zhao D B, Yi J Q, and Xu J. Ramp Metering Based on On-line ADHDP($\lambda$) Controller[C]. Proceedings of the 2008 International Joint Conference on Neural Networks (IJCNN 2008), 2008: 1847-1852.
{13}Bertsekas D P. Dynamic Programming: Deterministic and Stochastic Models[M]. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1987.
{14}Widrow B, Gupta N, and Maitra S. Punish/reward: Learning with a critic in adaptive threshold systems[J]. IEEE Transactions on Systems, Man, and Cybernetics, 1973, 3(5): 455-465.
{18}Liu D R, Zhang H G. A Neural Dynamic Programming Approach for Learning Control of Failure Avoidance Problems[J]. International Journal of Intelligent Control and Systems, 2005, 10(1): 21-32.
{19}Barto A G, Sutton R S, Anderson C W. Neuronlike Adaptive Elements that Can Solve Difficult Learning Control Problems[J]. IEEE Transactions on Systems, Man, and Cybernetics, 1983, 13(5): 835-846.
{22}Hagan M T, Demuth H B, Beale M. Neural network design[M]. Beijing: China Machine Press (English edition), 2002.