Near-Optimal Reinforcement Learning in Polynomial Time
Machine Learning, 2002, Vol. 49(2-3), pp. 209–232
Citations over time: top 1% of 2002 papers
Related Papers
- Research on Markov Game-Based Multiagent Reinforcement Learning Model and Algorithms (2000)
- → Policy Gradient using Weak Derivatives for Reinforcement Learning (2019), 2 cited
- → Customized Dynamic Pricing for Air Cargo Network via Reinforcement Learning (2020), 1 cited
- → RVI reinforcement learning for semi-Markov decision processes with average reward (2010), 1 cited
- Research on Agent Reinforcement Learning Policy Based on DFS (2010)