A dynamic programming algorithm for decentralized Markov decision processes with a broadcast structure

2010, pp. 6143–6148

Abstract

We give an optimal dynamic programming algorithm to solve a class of finite-horizon decentralized Markov decision processes (MDPs). We consider problems with a broadcast information structure that consists of a central node that only has access to its own state but can affect several outer nodes, while each outer node has access to both its own state and the central node's state, but cannot affect the other nodes. The solution to this problem involves a dynamic program similar to that of a centralized partially-observed Markov decision process.
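For orientation, the kind of recursion the abstract compares against, a finite-horizon dynamic program over belief states for a centralized partially observed MDP, can be sketched as follows. This is a generic textbook-style illustration and not the algorithm of the paper: the function names `belief_update` and `optimal_value`, the nested-list layout of `T`, `O`, and `R`, and the brute-force enumeration of reachable beliefs are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Belief = List[float]

# Assumed (hypothetical) model layout:
#   T[a][s][s2] = P(next state s2 | state s, action a)
#   O[a][s2][o] = P(observation o | next state s2, action a)
#   R[a][s]     = immediate reward for taking action a in state s


def belief_update(
    b: Belief, a: int, o: int, T: list, O: list
) -> Tuple[Optional[Belief], float]:
    """Bayes update of belief b after action a and observation o.

    Returns (new_belief, probability_of_o); new_belief is None when the
    observation has zero probability under b and a.
    """
    n = len(b)
    unnorm = [
        O[a][s2][o] * sum(b[s] * T[a][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    p_obs = sum(unnorm)
    if p_obs == 0.0:
        return None, 0.0
    return [u / p_obs for u in unnorm], p_obs


def optimal_value(b: Belief, steps_left: int, T: list, O: list, R: list) -> float:
    """Optimal expected total reward from belief b with steps_left decisions.

    Exhaustive recursion over actions and observations; the cost grows
    exponentially with the horizon, so this is only for tiny examples.
    """
    if steps_left == 0:
        return 0.0
    num_obs = len(O[0][0])
    best = float("-inf")
    for a in range(len(T)):
        # Expected immediate reward under the current belief.
        q = sum(b[s] * R[a][s] for s in range(len(b)))
        # Expected future value, averaged over possible observations.
        for o in range(num_obs):
            b_next, p_obs = belief_update(b, a, o, T, O)
            if b_next is not None:
                q += p_obs * optimal_value(b_next, steps_left - 1, T, O, R)
        best = max(best, q)
    return best
```

The belief vector plays the role of the state in this recursion. The abstract states only that the dynamic program for the broadcast structure is similar to this centralized one, so the sketch should be read as background rather than as a reconstruction of the paper's method.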
