Markov Decision Processes
Abstract
The literature on inference and planning is vast. This chapter presents a type of decision process in which the state dynamics are Markov. Such a process, called a Markov decision process (MDP), is a reasonable model in many situations and has in fact found applications in a wide range of practical problems. An MDP is a decision process in which the distribution of the next state S[n + 1] of the environment, or the system, depends only on the current state of the system, denoted S[n], and the action (or decision) a[n] taken at the current time. The chapter explains finite-horizon MDPs and infinite-horizon MDPs. Policy iteration and value iteration can be used to compute a sequence of value functions for a finite-horizon partially observable Markov decision process (POMDP) with increasing horizon length; once the change between successive value functions is negligible, the result serves as an approximation to the infinite-horizon optimal value function.
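To make the value-iteration idea in the last sentence concrete, here is a minimal sketch of the fully observable, tabular variant in Python/NumPy. This is an illustration under assumptions, not the chapter's own code: the function name, the (A, S, S) transition array P, the reward array R, and the toy numbers are all hypothetical.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-6):
    """Tabular value iteration for a fully observable finite MDP (illustrative sketch).

    P : ndarray, shape (A, S, S); P[a, s, s2] = Pr(S[n+1] = s2 | S[n] = s, a[n] = a)
    R : ndarray, shape (A, S); expected immediate reward for taking action a in state s
    gamma : discount factor in [0, 1)
    tol : stop once the sup-norm change between successive value functions
          is negligible, mirroring the stopping criterion described above
    """
    V = np.zeros(P.shape[1])
    while True:
        # Bellman backup: Q[a, s] = R[a, s] + gamma * sum_{s2} P[a, s, s2] * V[s2]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)               # greedy over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)  # value function and a greedy policy
        V = V_new

# Toy two-state, two-action MDP (made-up numbers, for demonstration only)
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
V, pi = value_iteration(P, R)
print(V, pi)
```

The same successive-approximation scheme underlies the POMDP case mentioned in the abstract, except that the backup there operates on value functions over belief states rather than over a finite state set.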
Related Papers
- State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms (1982), 745 citations
- POMDP-lite for robust robot planning under uncertainty (2016), 48 citations
- Assume-guarantee reasoning framework for MDP-POMDP (2016), 7 citations
- Survey of algorithms for partially observable Markov decision processes (2008)