Acta Electronica Sinica (电子学报), 2008

Transfer in Reinforcement Learning: Methods and Progress

pp. 39-43

Keywords: transfer learning, reinforcement learning, knowledge, behavior, cognitive psychology, abstraction, generalization

Abstract:

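Traditional machine learning methods treat different learning tasks as unrelated, but in practice different tasks are often interrelated. Transfer learning tries to exploit the connections between tasks, using past learning experience to speed up learning on a new task. Research on transfer learning is under way in every branch of machine learning. This paper surveys transfer techniques for reinforcement learning. Following the theory of cognitive psychology, it divides existing techniques into two broad categories, behavior transfer and knowledge transfer, describes and analyzes the characteristics of each, and raises some open questions.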
