Control Theory & Applications (控制理论与应用), 2002

On-line optimization algorithm for Markov control processes based on a single sample path

TANG Hao (唐昊), XI Hong-sheng (奚宏生), YIN Bao-qun (殷保群)

Keywords: Markov control processes, Markov performance potentials, randomized stationary policies, on-line optimization

Abstract:

Based on the theory of Markov performance potentials, this paper studies a performance optimization algorithm for Markov control processes. Unlike traditional computation-based approaches, the algorithm estimates the gradients of the performance measure with respect to the policy parameters by simulating a single sample path, and uses these estimates to search for an optimal (or suboptimal) randomized stationary policy. Because the parameters of the algorithm can be chosen to suit the properties of the system at hand, the algorithm can meet the needs of on-line optimization in many different real-world engineering systems. Finally, the paper considers the convergence of the algorithm with probability one on an infinite sample path, and provides a numerical example for a three-state controlled Markov chain.
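
The idea of estimating a performance gradient from one sample path can be illustrated with a minimal sketch. It is not the paper's algorithm: the three-state model, its transition matrices and costs, the long-run average cost criterion, the policy parametrization (in state i, action B is chosen with probability theta[i]), and all function names are assumptions made for illustration. The sketch uses the standard potential-based gradient formula, d(eta)/d(theta_i) = pi_i [(P_B[i] - P_A[i]) g + (F_B[i] - F_A[i])], and estimates the potentials g by truncated sums of centered costs along a single simulated path.

```python
import numpy as np

# Hypothetical three-state model with two actions (A, B); all numbers are illustrative.
P_A = np.array([[0.2, 0.5, 0.3], [0.6, 0.1, 0.3], [0.3, 0.3, 0.4]])
P_B = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]])
F_A = np.array([1.0, 3.0, 2.0])  # cost of action A in each state
F_B = np.array([2.0, 1.0, 4.0])  # cost of action B in each state


def model(theta):
    """Transition matrix and expected cost when state i picks B with prob. theta[i]."""
    t = theta[:, None]
    return (1 - t) * P_A + t * P_B, (1 - theta) * F_A + theta * F_B


def stationary(P):
    """Stationary distribution: solve pi P = pi together with sum(pi) = 1."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    return np.linalg.lstsq(A, b, rcond=None)[0]


def average_cost(theta):
    P, f = model(theta)
    return stationary(P) @ f


def potential_gradient(theta):
    """Exact gradient via potentials g solving (I - P + e pi) g = f."""
    P, f = model(theta)
    pi = stationary(P)
    n = len(f)
    g = np.linalg.solve(np.eye(n) - P + np.outer(np.ones(n), pi), f)
    return pi * ((P_B - P_A) @ g + (F_B - F_A))


def sample_path_gradient(theta, steps=200_000, window=20, seed=0):
    """Estimate the same gradient from one simulated path (truncated-sum potentials)."""
    rng = np.random.default_rng(seed)
    P, f = model(theta)
    n = len(f)
    cdf = np.cumsum(P, axis=1)
    u = rng.random(steps)
    path = np.empty(steps, dtype=int)
    x = 0
    for k in range(steps):
        path[k] = x
        x = min(int(np.searchsorted(cdf[x], u[k])), n - 1)
    cost = f[path]
    eta = cost.mean()  # time-average estimate of the average cost
    # s[k] = sum of (cost - eta) over the window starting at time k
    c = np.concatenate([[0.0], np.cumsum(cost - eta)])
    s = c[window:] - c[:-window]
    starts = path[: len(s)]
    visits = np.bincount(starts, minlength=n)
    g = np.bincount(starts, weights=s, minlength=n) / visits  # potential estimates
    pi = np.bincount(path, minlength=n) / steps  # visit frequencies
    return pi * ((P_B - P_A) @ g + (F_B - F_A))


if __name__ == "__main__":
    theta = np.array([0.5, 0.5, 0.5])
    print("exact gradient:      ", potential_gradient(theta))
    print("sample-path estimate:", sample_path_gradient(theta))
```

In an on-line setting, such an estimate could drive a projected gradient step on theta (for example, subtracting a small multiple of the estimate and clipping to [0, 1]). The step size and window length would be the tunable parameters in this sketch; how the paper chooses its own parameters is not stated in the abstract.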
