Jonathan Uesato
Research Areas
Adversarial Robustness in Machine Learning, Topic Modeling, Explainable Artificial Intelligence (XAI), Natural Language Processing Techniques, Anomaly Detection Techniques and Applications
Most-Cited Works
- Taxonomy of Risks posed by Language Models (2022), 518 citations
- Technical Report on the CleverHans v2.1.0 Adversarial Examples Library (2016), 404 citations
- Adversarial Risk and the Dangers of Evaluating Against Weak Attacks (2018), 304 citations
- On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models (2018), 300 citations
- Scaling Language Models: Methods, Analysis & Insights from Training Gopher (2021), 242 citations
- Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples (2020), 144 citations
- Scalable Verified Training for Provably Robust Image Classification (2019), 137 citations
- Improving alignment of dialogue agents via targeted human judgements (2022), 131 citations
- Adversarial Risk and the Dangers of Evaluating Against Weak Attacks (2018)
- Are Labels Required for Improving Adversarial Robustness? (2019)