Decision making is one of the central problems in artificial intelligence, and in robotics in particular. In most cases the problem involves uncertainty, both in the data received by the decision maker (agent) and in the actions it performs in the environment. One effective way to address this problem is to model the environment and the agent as a Partially Observable Markov Decision Process (POMDP). POMDPs have a wide range of applications, such as machine vision, marketing, network troubleshooting, and medical diagnosis. In recent years, there has been significant interest in developing techniques for finding policies for POMDPs. We consider two new techniques, called Recursive Point Filter (RPF) and Scan Line Filter (SCF), both based on the Incremental Pruning (IP) POMDP solver, as alternatives to the Linear Programming (LP) filter used in IP. Both RPF and SCF found solutions for several POMDP problems on which the LP filter did not converge within 24 hours. Experiments are run on problems from the POMDP literature, and an Average Discounted Reward (ADR) is computed by testing each policy in a simulated environment.
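The abstract does not spell out how the ADR is computed, so the following is only a minimal sketch under a common assumption: each simulated episode yields a sequence of rewards, each episode's return is discounted by a factor gamma per step, and the ADR is the mean of those returns over all episodes. The function names, the discount factor, and the reward values are hypothetical and chosen for illustration only.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**t * r_t over one simulated episode (assumed definition)."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


def average_discounted_reward(
    episodes: Sequence[Sequence[float]], gamma: float
) -> float:
    """Mean discounted return over a set of simulated episodes."""
    if not episodes:
        raise ValueError("at least one episode is required")
    returns = [discounted_return(episode, gamma) for episode in episodes]
    return sum(returns) / len(returns)


# Hypothetical reward sequences from two simulated runs of a policy.
example_episodes = [[1.0, 1.0, 1.0], [0.0, 0.0, 4.0]]
example_adr = average_discounted_reward(example_episodes, gamma=0.5)
```

With these illustrative inputs, the first episode has a discounted return of 1.75 and the second 1.0, so the sketch gives an ADR of 1.375. In practice the reward sequences would come from running the computed policy in the simulator.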
A. Cassandra, M. L. Littman and N. L. Zhang, “Incremental Pruning: A Simple, Fast, Exact Algorithm for Partially Observable Markov Decision Processes,” Proceedings of the 13th Annual Conference on Uncertainty in Artificial Intelligence, Brown, 1997.
A. R. Cassandra, L. P. Kaelbling and M. L. Littman, “Acting Optimally in Partially Observable Stochastic Domains,” Proceedings of the 12th National Conference on Artificial Intelligence, Seattle, 1994.