Journal of Software (软件学报), 2008

Optimal Action Criterion and Algorithm Improvement of Real-Time Dynamic Programming

FAN Chang-jie (范长杰), CHEN Xiao-ping (陈小平)

Keywords: MDP (Markov decision process), RTDP (real-time dynamic programming), convergence criterion, incremental solving, heuristic search

Abstract:

This paper aims to improve the efficiency of the real-time dynamic programming (RTDP) algorithm for solving Markov decision problems. Several typical convergence criteria are compared and analyzed. Based on the theory of upper and lower bounds, a criterion called the optimal action criterion and a corresponding branch strategy are proposed. This criterion lets the agent act earlier in a real-time decision process while still retaining an optimal policy of sufficient precision. It can be proved that, under certain conditions, such an incremental method yields an optimal policy of arbitrary precision. With these new techniques, a bounded incremental real-time dynamic programming (BI-RTDP) algorithm is designed. In experiments on two typical real-time simulation systems, BI-RTDP outperforms the other state-of-the-art RTDP algorithms tested.
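
The abstract does not spell out the optimal action criterion, so the following is only a minimal sketch of one plausible reading of a bound-based test, not the authors' definition. It assumes a cost-minimizing MDP with a lower and an upper bound on the value of every state, and declares an action safe to execute once its upper-bound Q-value is no worse than the lower-bound Q-value of every rival action (within a tolerance `epsilon`). The toy MDP, the function names, and the parameters `gamma` and `epsilon` are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy MDP: transitions[state][action] is a list of
# (probability, next_state, immediate_cost) outcomes.
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def q_value(transitions: Transitions, values: Dict[str, float],
            state: str, action: str, gamma: float = 1.0) -> float:
    """One-step lookahead: expected cost of `action` given state values."""
    return sum(p * (cost + gamma * values[nxt])
               for p, nxt, cost in transitions[state][action])


def provably_optimal_action(transitions: Transitions,
                            lower: Dict[str, float],
                            upper: Dict[str, float],
                            state: str,
                            gamma: float = 1.0,
                            epsilon: float = 0.0) -> Optional[str]:
    """Return an action whose upper-bound cost is within `epsilon` of every
    rival's lower-bound cost, or None if the bounds are still too loose."""
    actions = transitions[state]
    q_lo = {a: q_value(transitions, lower, state, a, gamma) for a in actions}
    q_hi = {a: q_value(transitions, upper, state, a, gamma) for a in actions}
    best = min(q_hi, key=q_hi.get)  # candidate: smallest worst-case cost
    rivals = [q_lo[a] for a in actions if a != best]
    if not rivals or q_hi[best] <= min(rivals) + epsilon:
        return best
    return None


# Assumed example data: two actions in s0, one of which may detour via s1.
mdp: Transitions = {
    "s0": {"safe": [(1.0, "goal", 2.0)],
           "risky": [(0.5, "goal", 1.0), (0.5, "s1", 1.0)]},
    "s1": {"back": [(1.0, "goal", 4.0)]},
    "goal": {},
}

# Loose bounds on s1: "risky" might still cost less than "safe".
loose = provably_optimal_action(
    mdp, lower={"s0": 0.0, "s1": 1.0, "goal": 0.0},
    upper={"s0": 10.0, "s1": 8.0, "goal": 0.0}, state="s0")

# Tight bounds on s1: "safe" is now provably at least as good.
tight = provably_optimal_action(
    mdp, lower={"s0": 0.0, "s1": 4.0, "goal": 0.0},
    upper={"s0": 10.0, "s1": 4.0, "goal": 0.0}, state="s0")
```

With the loose bounds the test returns `None`, so the agent would keep refining; with the tight bounds it returns `"safe"`. Note that the bounds at `s0` itself are still far apart in both cases, which illustrates how a criterion of this kind could let the agent act before the state values have fully converged.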
