Jan Leike
Research Areas
Reinforcement Learning in Robotics, Machine Learning and Algorithms, Advanced Bandit Algorithms Research, Topic Modeling, Natural Language Processing Techniques
Most-Cited Works
- Training language models to follow instructions with human feedback (2022), 4,260 citations
- Evaluating Large Language Models Trained on Code (2021), 1,403 citations
- Deep reinforcement learning from human preferences (2017), 508 citations
- Scalable agent alignment via reward modeling: a research direction (2018), 124 citations
- AI Safety Gridworlds (2017), 117 citations
- Learning to Understand Goal Specifications by Modelling Reward (2018), 69 citations
- Recursively Summarizing Books with Human Feedback (2021), 66 citations
- On Thompson Sampling and Asymptotic Optimality (2017), 59 citations
- Linear Ranking for Linear Lasso Programs (2013), 49 citations
- Self-critiquing models for assisting human evaluators (2022), 46 citations