Q-Value Weighted Regression: Reinforcement Learning with Limited Data
Abstract
Sample efficiency has emerged as a significant challenge for deep reinforcement learning. We introduce Q-Value Weighted Regression (QWR), a simple RL algorithm that excels in this aspect. QWR builds upon Advantage Weighted Regression (AWR), an off-policy actor-critic algorithm that performs very well on continuous control tasks but has low sample efficiency and struggles with high-dimensional observation spaces. We perform both theoretical and empirical analyses of AWR that explain its shortcomings, and we use these insights to motivate QWR. We show experimentally that QWR matches or outperforms state-of-the-art algorithms on tasks with both continuous and discrete actions. In particular, QWR yields results on par with SAC on the MuJoCo suite and, with the same set of hyperparameters, outperforms a highly tuned implementation of Rainbow on a set of Atari games. At the same time, QWR is a much simpler algorithm than either SAC or Rainbow.
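To make the abstract's starting point concrete, the sketch below illustrates the weighted-regression actor update at the heart of AWR, the algorithm QWR builds upon: replayed actions are regressed onto the policy with weights proportional to the exponentiated advantage. This is a minimal standalone illustration, not the paper's implementation; the function names, the temperature `beta`, and the weight clip are assumptions for the example.

```python
import numpy as np

def awr_weights(advantages, beta=1.0, max_weight=20.0):
    """Exponential advantage weights exp(A / beta), clipped for stability.

    Samples whose actions did better than the baseline (A > 0) get
    weight > 1; worse-than-baseline samples are down-weighted.
    `beta` and `max_weight` are illustrative choices, not the paper's.
    """
    return np.minimum(np.exp(advantages / beta), max_weight)

def weighted_regression_loss(log_probs, advantages, beta=1.0):
    """Negative advantage-weighted log-likelihood of replayed actions.

    Minimizing this pulls the policy toward actions in the replay
    buffer, more strongly the higher their advantage estimate.
    """
    w = awr_weights(advantages, beta)
    return -np.mean(w * log_probs)

# Toy usage: two replayed samples with equal log-probability; the one
# with the higher advantage dominates the actor loss.
log_probs = np.array([-1.0, -1.0])
advantages = np.array([0.0, 1.0])
loss = weighted_regression_loss(log_probs, advantages, beta=0.5)
```

QWR's change, per the abstract's framing, is to drive these weights with Q-value estimates rather than AWR's value baseline, which is what lets it handle discrete-action domains like Atari as well as continuous control.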