Finding shortcuts from episode in multi-agent reinforcement learning
Abstract
In multi-agent reinforcement learning, the state space grows exponentially with the number of agents, which lengthens training episodes and slows convergence. To improve convergence efficiency, we propose an algorithm that finds shortcuts in episodes to speed up learning. Loops, which correspond to ineffective paths in an episode, are removed, while all shortest state paths from every other state to the goal state within the original episode are preserved; removing these loops therefore loses no state-space knowledge. Shortening episodes in this way accelerates convergence. Since the learning process contains a large number of episodes, the overall improvement accumulated from each episode's improvement is considerable. The episode of the multi-agent pursuit problem is used to illustrate the effectiveness of our algorithm. We believe this algorithm can be introduced into most other reinforcement learning approaches to speed up convergence, because the improvement operates on the episode, which is the most fundamental learning unit of reinforcement learning.
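The core loop-removal idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: it assumes an episode is a list of hashable states ending at the goal, and it erases each loop by cutting the segment between a state's first visit and its revisit, leaving a loop-free path whose suffixes are shorter routes to the goal.

```python
def remove_loops(episode):
    """Erase loops from an episode (list of hashable states).

    When a state reappears, the segment between its first visit and
    the revisit is a loop contributing no progress toward the goal,
    so it is cut out. The result is a loop-free (shorter) episode.
    """
    path = []    # loop-free prefix built so far
    index = {}   # state -> position of its first occurrence in path
    for state in episode:
        if state in index:
            # Revisited state: drop the loop back to its first visit.
            cut = index[state]
            for removed in path[cut + 1:]:
                del index[removed]
            path = path[:cut + 1]
        else:
            index[state] = len(path)
            path.append(state)
    return path


# Example: the detour B -> C -> B is a loop and is erased.
print(remove_loops(["A", "B", "C", "B", "D", "G"]))  # ['A', 'B', 'D', 'G']
```

Because a large number of episodes are processed during learning, applying such a pass per episode before the value updates is where the accumulated speed-up described in the abstract would come from; the representation of joint multi-agent states as single hashable objects is an assumption of this sketch.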