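Acta Electronica Sinica (电子学报), 2008

Transfer in Reinforcement Learning: Methods and Progress

pp. 39-43

Wang Hao (王皓), Gao Yang (高阳), Chen Xingguo (陈兴国)

Keywords: transfer learning, reinforcement learning, knowledge, behavior, cognitive psychology, abstraction, generalization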

Abstract:

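Traditional machine learning methods treat different learning tasks as unrelated, but in practice different tasks are often related. Transfer learning tries to exploit the relationships between tasks, using past learning experience to speed up learning on a new task. Research on transfer learning is under way in every branch of machine learning. This paper surveys transfer techniques for reinforcement learning. Following theories from cognitive psychology, it divides existing techniques into two broad classes, behavior transfer and knowledge transfer, describes and analyzes the characteristics of each, and raises several open problems.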
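To make the idea of reusing past experience concrete, the sketch below shows one simple form of transfer: a Q-table learned on a source task is used to initialize learning on a related target task. It is an illustration only and is not taken from the surveyed paper. The chain environment, the reward of -1 per step, the function name `q_learning`, and all hyperparameters are assumptions chosen for brevity.

```python
import random


def q_learning(n_states, goal, episodes, q_init=None, epsilon=0.0,
               alpha=0.5, gamma=0.95, max_steps=500, seed=0):
    """Tabular Q-learning on a 1-D chain (hypothetical toy task).

    Actions: 0 = move left, 1 = move right. Every step costs -1 and an
    episode starts in state 0 and ends when the agent reaches `goal`.
    """
    rng = random.Random(seed)
    # Copy the initial table so the caller's (source) values stay untouched.
    if q_init is not None:
        q = [row[:] for row in q_init]
    else:
        q = [[0.0, 0.0] for _ in range(n_states)]
    steps_per_episode = []
    for _ in range(episodes):
        state, steps = 0, 0
        while state != goal and steps < max_steps:
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                # Greedy choice; max() returns the first maximum, so ties go left.
                action = max((0, 1), key=lambda a: q[state][a])
            if action == 0:
                nxt = max(0, state - 1)
            else:
                nxt = min(n_states - 1, state + 1)
            # The goal is terminal, so it contributes no future value.
            future = 0.0 if nxt == goal else max(q[nxt])
            q[state][action] += alpha * (-1.0 + gamma * future - q[state][action])
            state, steps = nxt, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode


# Source task: goal at the right end of a 10-state chain.
source_q, _ = q_learning(n_states=10, goal=9, episodes=300, epsilon=0.2)

# Target task: same chain, goal moved one state to the left.
# Both runs are greedy (epsilon=0) so the comparison is deterministic.
scratch_q, scratch_steps = q_learning(n_states=10, goal=8, episodes=20)
transfer_q, transfer_steps = q_learning(n_states=10, goal=8, episodes=20,
                                        q_init=source_q)

print("first episode, from scratch: ", scratch_steps[0])
print("first episode, with transfer:", transfer_steps[0])
```

Because the source values already prefer moving right, the warm-started agent should head for the new goal from its first episode, while the agent starting from zeros has to discover that direction. Whether such a warm start helps in general depends on how closely the two tasks are related.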
