Victoria Krakovna
Research Areas
Reinforcement Learning in Robotics, Ethics and Social Impacts of AI, Adversarial Robustness in Machine Learning, Software Engineering Research, Natural Language Processing Techniques
Most-Cited Works
- AI Safety Gridworlds (2017), 117 citations
- The Ethics of Advanced AI Assistants (2024), 37 citations
- Penalizing side effects using stepwise relative reachability (2018), 23 citations
- Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective (2021), 22 citations
- Reinforcement Learning with a Corrupted Reward Channel (2017), 16 citations
- Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals (2022), 14 citations
- Measuring and avoiding side effects using relative reachability (2018)
- Avoiding Side Effects By Considering Future Tasks (2020), 11 citations
- Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input (2016)
- Modeling AGI Safety Frameworks with Causal Influence Diagrams (2019), 9 citations