Metrics and Continuity in Reinforcement Learning

Proceedings of the AAAI Conference on Artificial Intelligence, 2021, Vol. 35(9), pp. 8261–8269

Charline Le Lan, Marc G. Bellemare, Pablo Samuel Castro

Abstract

In most practical applications of reinforcement learning, it is untenable to maintain direct estimates for individual states; in continuous-state systems, it is impossible. Instead, researchers often leverage state similarity (whether explicitly or implicitly) to build models that can generalize well from a limited set of samples. The notion of state similarity used, and the neighbourhoods and topologies it induces, are thus of crucial importance, as they directly affect the performance of the algorithms. Indeed, a number of recent works introduce algorithms assuming the existence of "well-behaved" neighbourhoods, but leave the full specification of such topologies for future work. In this paper we introduce a unified formalism for defining these topologies through the lens of metrics. We establish a hierarchy amongst these metrics and demonstrate their theoretical implications for the Markov decision process specifying the reinforcement learning problem. We complement our theoretical results with empirical evaluations showcasing the differences between the metrics considered.
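To make the idea of a metric inducing neighbourhoods concrete, here is a minimal sketch, not taken from the paper. It assumes one-dimensional real-valued states and compares two illustrative metrics; the function names (`discrete_metric`, `absolute_metric`, `neighbourhood`) and the example states are hypothetical choices made only for this illustration.

```python
from typing import Callable, List

# A state-similarity metric maps a pair of states to a non-negative distance.
Metric = Callable[[float, float], float]


def discrete_metric(s: float, t: float) -> float:
    """Distance 0 for identical states, 1 otherwise (no generalization)."""
    return 0.0 if s == t else 1.0


def absolute_metric(s: float, t: float) -> float:
    """Absolute difference between two scalar states."""
    return abs(s - t)


def neighbourhood(center: float, states: List[float],
                  metric: Metric, radius: float) -> List[float]:
    """Return the states lying in the open ball of the given radius."""
    return [s for s in states if metric(center, s) < radius]


# Hypothetical sampled states, assumed only for this example.
states = [0.0, 0.1, 0.2, 0.9]

# Same centre and radius, different metric, different neighbourhood.
near_absolute = neighbourhood(0.1, states, absolute_metric, radius=0.15)
near_discrete = neighbourhood(0.1, states, discrete_metric, radius=0.15)
```

Under the absolute-difference metric the ball around 0.1 contains three of the sampled states, so an estimate at 0.1 could borrow from its neighbours; under the discrete metric it contains only 0.1 itself, which corresponds to keeping a separate estimate per state.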
