%0 Journal Article
%T Optimal Action Criterion and Algorithm Improvement of Real-Time Dynamic Programming
实时动态规划的最优行动判据及算法改进
%A FAN Chang-jie
%A CHEN Xiao-ping
%A 范长杰
%A 陈小平
%J Journal of Software (软件学报)
%D 2008
%X This paper aims to improve the efficiency of the real-time dynamic programming (RTDP) algorithm for solving Markov decision problems. Several typical convergence criteria are compared and analyzed. Based on the theory of upper and lower bounds, a criterion called the optimal action criterion and a corresponding branch strategy are proposed. This criterion guarantees that the agent can act earlier in a real-time decision process while still retaining an optimal policy of sufficient precision. It can be proved that, under certain conditions, an optimal policy of arbitrary precision can be obtained with this incremental method. Using these techniques, a bounded incremental real-time dynamic programming (BI-RTDP) algorithm is designed. In experiments on two typical real-time simulation systems, BI-RTDP outperforms the other state-of-the-art RTDP algorithms tested.
%K MDP (Markov decision process)
%K RTDP (real-time dynamic programming)
%K convergence criterion
%K incremental solving
%K heuristic search
%K 马尔可夫决策过程
%K 实时动态规划
%K 收敛判据
%K 增量求解
%K 启发式搜索
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=7735F413D429542E610B3D6AC0D5EC59&aid=1799D00876B34736E2CCE4C5749B09AA&yid=67289AFF6305E306&vid=2A8D03AD8076A2E3&iid=708DD6B15D2464E8&sid=2C9C419B408CCD32&eid=E74D4F95E117560C&journal_id=1000-9825&journal_name=软件学报&referenced_num=1&reference_num=14