Q-learning

Machine Learning · 1992 · Vol. 8(3-4), pp. 279–292

Top 1% of 1992 papers

Christopher J. Watkins, Peter Dayan

Related Papers

→ A Generalized Reinforcement-Learning Model: Convergence and Applications (1996)
→ Mixed Reinforcement Learning for Partially Observable Markov Decision Process (2007) · 5 cited
→ Reinforcement learning for MDPs using temporal difference schemes (2002) · 4 cited
→ Policy Gradient using Weak Derivatives for Reinforcement Learning (2019) · 2 cited
→ Convergence of the Q-ae learning under deterministic MDPs and its efficiency under the stochastic environment (2002) · 3 cited