Sutton R S,Barto A G.Reinforcement Learning[M].Cambridge:MIT Press,1998.
[2]
Fernández F,Veloso M.Probabilistic policy reuse in a reinforcement learning agent[A].Proceedings of the Fifth International Conference on Autonomous Agents and Multi-Agent Systems[C].New York:ACM,2006.
[3]
Fernández F,Veloso M.Policy reuse for transfer learning across tasks with different state and action spaces[A].Proceedings of the ICML-06 Workshop on Structural Knowledge Transfer for Machine Learning[C].New York:ACM,2006.
[4]
Bernstein D S.Reusing old policies to accelerate learning on new MDPs[R].Amherst:Amherst College,University of Massachusetts,1999.
[5]
Dietterich T G.Hierarchical reinforcement learning with the MAXQ value function decomposition[J].Journal of Artificial Intelligence Research,2000,13(2):227-303.
[6]
Ravindran B,Barto A G.SMDP homomorphisms:an algebraic approach to abstraction in semi-Markov decision processes[A].Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence[C].San Francisco:Morgan Kaufmann,2003.
[7]
Mahadevan S.Proto-value functions:developmental reinforcement learning[A].Proceedings of the Twenty-second International Conference on Machine Learning[C].New York:ACM,2005.
[8]
Sherstov A A,Stone P.Improving action selection in MDP's via knowledge transfer[A].Proceedings of the Twentieth National Conference on Artificial Intelligence[C].New York:ACM,2005.
[9]
Taylor M E,Kuhlmann G,Stone P.Autonomous transfer for reinforcement learning[A].Proceedings of the Seventh International Conference on Autonomous Agents and Multi-Agent Systems[C].Estoril,Portugal:IFAAMAS,2008.
[10]
Bowling M,Veloso M.Reusing learned policies between similar problems[A].Proceedings of AI*IA-98 Workshop on New Trends in Robotics[C].Berlin,Germany:Springer Verlag,1998.
[11]
Pickett M,Barto A G.PolicyBlocks:an algorithm for creating useful macro-actions in reinforcement learning[A].Proceedings of the Nineteenth International Conference on Machine Learning[C].San Francisco:Morgan Kaufmann,2002.506-513.
[12]
McGovern A,Barto A G.Automatic discovery of subgoals in reinforcement learning using diverse density[A].Proceedings of the Eighteenth International Conference on Machine Learning[C].San Francisco:Morgan Kaufmann,2001.361-368.
[13]
Mehta N,Natarajan S,Tadepalli P,Fern A.Transfer in variable-reward hierarchical reinforcement learning[A].Proceedings of the NIPS-05 Workshop on Inductive Transfer[C].Cambridge:MIT Press,2005.360-366.
[14]
Soni V,Singh S.Using homomorphisms to transfer options across continuous reinforcement learning domains[A].Proceedings of the Twenty-first National Conference on Artificial Intelligence[C].Boston:AAAI Press,2006.
[15]
Madden M G,Howley T.Transfer of experience between reinforcement learning environments with progressive difficulty[J].Artificial Intelligence Review,2004,21(3):375-398.
[16]
Driessens K,Ramon J,Croonenborghs T.Transfer learning for reinforcement learning through goal and policy parameterization[A].Proceedings of the ICML-06 Workshop on Structural Knowledge Transfer for Machine Learning[C].New York:ACM,2006.
[17]
Torrey L,Shavlik J,Walker T,Maclin R.Skill acquisition via transfer learning and advice taking[A].Proceedings of the Seventeenth European Conference on Machine Learning[C].Berlin,Germany:Springer,2006.425-436.
[18]
Anderson J R.Cognitive Psychology and Its Applications(third edition)[M].New York:Freeman,1990.