Sutton R S, Barto A G. Reinforcement Learning: An Introduction. Cambridge, USA: MIT Press, 1998
[2]
Landelius T. Reinforcement Learning and Distributed Local Model Synthesis. PhD Dissertation. Department of Electrical Engineering, Linkoping University, Linkoping, Sweden, 1997
[3]
Xu X, He H G, Hu D W. Efficient Reinforcement Learning Using Recursive Least-Squares Methods. Journal of Artificial Intelligence Research, 2002, 16: 259-292
[4]
Wen F, Chen Z H, Wang A Q. An Improvement to Fast-AHC Algorithm. Information and Control, 2004, 32(7): 652-656
[5]
Werbos P J. Stable Adaptive Control Using New Critic Designs. 1998. http://arxiv.org/html/adap-org/9810001
[6]
Tsitsiklis J N, Van Roy B. An Analysis of Temporal-Difference Learning with Function Approximation. IEEE Trans on Automatic Control, 1997, 42(5): 674-690
[7]
Bradtke S J. Incremental Dynamic Programming for On-Line Adaptive Optimal Control. PhD Dissertation. Department of Computer Science, University of Massachusetts, Amherst, USA, 1994
[8]
Boyan J. Least-Squares Temporal Difference Learning. In: Bratko I, Dzeroski S, eds. Proc of the 16th International Conference on Machine Learning. San Francisco, USA: Morgan Kaufmann, 1999, 49-56
[9]
Goodwin G C, Sin K S. Adaptive Filtering Prediction and Control. Englewood Cliffs, USA: Prentice-Hall, 1984