Markov Decision Processes
Abstract
The literature on inference and planning is vast. This chapter presents a type of decision process in which the state dynamics are Markov. Such a process, called a Markov decision process (MDP), is a reasonable model in many situations and has in fact found applications in a wide range of practical problems. An MDP is a decision process in which the distribution of the next state S[n + 1] of the environment, or the system, depends only on the current state of the system, denoted S[n], and the action (or decision) a[n] taken at the current time. The chapter explains finite-horizon MDPs and infinite-horizon MDPs. Policy iteration and value iteration can be used to compute a sequence of value functions for a finite-horizon partially observable Markov decision process (POMDP) with increasing horizon length; once the change between successive value functions is negligible, the result serves as an approximation to the infinite-horizon optimal value function.
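To make the value-iteration idea in the last sentence concrete, here is a minimal sketch of the fully observable, tabular variant in Python/NumPy. This is an illustration under assumptions, not the chapter's own code: the function name, the (A, S, S) transition array P, the reward array R, and the toy numbers are all hypothetical.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-6):
    """Tabular value iteration for a fully observable finite MDP (illustrative sketch).

    P : ndarray, shape (A, S, S); P[a, s, s2] = Pr(S[n+1] = s2 | S[n] = s, a[n] = a)
    R : ndarray, shape (A, S); expected immediate reward for taking action a in state s
    gamma : discount factor in [0, 1)
    tol : stop once the sup-norm change between successive value functions
          is negligible, mirroring the stopping criterion described above
    """
    V = np.zeros(P.shape[1])
    while True:
        # Bellman backup: Q[a, s] = R[a, s] + gamma * sum_{s2} P[a, s, s2] * V[s2]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)               # greedy over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)  # value function and a greedy policy
        V = V_new

# Toy two-state, two-action MDP (made-up numbers, for demonstration only)
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
V, pi = value_iteration(P, R)
print(V, pi)
```

The same successive-approximation scheme underlies the POMDP case mentioned in the abstract, except that the backup there operates on value functions over belief states rather than over a finite state set.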
Related Papers
- State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms (1982), 745 citations
- POMDP-lite for robust robot planning under uncertainty (2016), 48 citations
- Assume-guarantee reasoning framework for MDP-POMDP (2016), 7 citations
- Survey of algorithms for partially observable Markov decision processes (2008)