Convergence of value iterations for total-cost MDPs and POMDPs with general state and action sets
Abstract
This paper describes conditions for convergence to optimal values of the dynamic programming algorithm applied to total-cost Markov Decision Processes (MDPs) with Borel state and action sets and with possibly unbounded one-step cost functions. It also studies applications of these results to Partially Observable MDPs (POMDPs). It is well known that POMDPs can be reduced to special MDPs, called Completely Observable MDPs (COMDPs), whose state spaces are sets of probability distributions over the original states. This paper describes conditions on POMDPs under which optimal policies for COMDPs can be found by value iteration. In other words, this paper provides sufficient conditions for solving total-cost POMDPs with infinite state, observation, and action sets by dynamic programming. Examples of applications to filtering, identification, and inventory control are provided.
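The value iteration scheme the abstract refers to computes the sequence V_{n+1} = TV_n, where (TV)(x) = min_a [ c(x, a) + Σ_y p(y | x, a) V(y) ] is the Bellman operator for the total-cost criterion, starting from V_0 = 0. The paper's results concern Borel state and action sets; the sketch below is only a finite toy instance to illustrate the iteration itself. All names and the toy MDP data are hypothetical, not taken from the paper.

```python
import numpy as np

def value_iteration(costs, transitions, n_iter=200):
    """Iterate the total-cost Bellman operator V_{n+1} = T V_n from V_0 = 0.

    costs[x, a]          -- one-step cost c(x, a)
    transitions[x, a, y] -- transition probability p(y | x, a)
    """
    n_states, _ = costs.shape
    v = np.zeros(n_states)                 # V_0 = 0
    for _ in range(n_iter):
        q = costs + transitions @ v        # Q(x, a) = c(x, a) + E[V(next)]
        v = q.min(axis=1)                  # minimize over actions
    return v

# Hypothetical transient MDP: state 1 is absorbing and cost-free, so
# total (undiscounted) costs stay finite and value iteration converges.
costs = np.array([[1.0, 2.0],
                  [0.0, 0.0]])
transitions = np.array([
    [[0.9, 0.1],    # state 0, action 0: cheap, but mostly stays put
     [0.0, 1.0]],   # state 0, action 1: costlier, jumps straight to 1
    [[0.0, 1.0],    # state 1 is absorbing under either action
     [0.0, 1.0]],
])

v = value_iteration(costs, transitions)
print(v)  # prints [2. 0.]: action 1 is optimal in state 0
```

Action 0 would satisfy V(0) = 1 + 0.9 V(0), i.e. a total cost of 10, so the iterates settle on action 1 with V(0) = 2. For the COMDP reduction discussed in the abstract, the same operator would act on functions of belief states (probability distributions over the original states) rather than on a finite vector.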