[1] WOLF R, HEISENBERG M. Basic organization of operant behavior as revealed in Drosophila flight orientation[J]. Journal of Comparative Physiology A, 1991, 169: 699-705.
[2] RUAN Xiao-gang. Neurocomputing Science[M]. Beijing: National Defense Industry Press, 2006: 553-596. (in Chinese)
[3] WANG Rui-xia, SUN Liang, RUAN Xiao-gang. Reinforcement learning based on internally recurrent net[J]. Control Engineering of China, 2005, 12(2): 138-140. (in Chinese)
[4] RAPHAEL B. The robot 'Shakey' and 'his' successors[J]. Computers and People, 1976, 25: 7-21.
[5] BROOKS R A. From earwigs to humans[J]. Robotics and Autonomous Systems, 1997, 20: 291-304.
[6] TOURETZKY D S, SAKSIDA L M. Operant conditioning in Skinnerbots[J]. Adaptive Behavior, 1997, 5(3/4): 219-247.
[7] ZALAMA E, GOMEZ J, PAUL M, et al. Adaptive behavior navigation of a mobile robot[J]. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 2002, 32(1): 160-169.
[8] DOMINGUEZ S, ZALAMA E. Robot learning in a social robot[J]. Lecture Notes in Computer Science, 2006, 4095: 691-702.
[9] HINTON G E, SEJNOWSKI T J, ACKLEY D H. Boltzmann machines: constraint satisfaction networks that learn[R]∥Carnegie-Mellon University Technical Report. Pittsburgh: CMU, 1984: 1-37.
[10] HINTON G E, SEJNOWSKI T J. Learning and relearning in Boltzmann machines[M]∥Parallel Distributed Processing. Cambridge: MIT Press, 1986: 282-317.
[11] GUO Mao-zu, LIU Yang, MALEC J. A new Q-learning algorithm based on the Metropolis criterion[J]. IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, 2004, 34(5): 2140-2143.
[12] DAHMANI Y, BENYETTOU A. Seek of an optimal way by Q-learning[J]. Journal of Computer Science, 2005, 1(1): 28-30.
[13] NAOYUKI K, HIROYUKI K. An utterance system of a partner robot based on interaction and perception[J]. World Automation Congress (WAC), 2006, 6: 236-241.
[14] RUAN Xiao-gang, REN Hong-ge. Two-wheeled self-balancing mobile robot dynamic model and balancing control[J]. Application Research of Computers, 2009, 26(1): 99-101. (in Chinese)