Vladimir Mikulik
Research Areas
Topic Modeling, Adversarial Robustness in Machine Learning, Explainable Artificial Intelligence (XAI), Natural Language Processing Techniques, Machine Learning in Materials Science
Most-Cited Works
- Scaling Language Models: Methods, Analysis & Insights from Training Gopher (2021), 242 citations
- Teaching language models to support answers with verified quotes (2022), 53 citations
- Alignment of Language Agents (2021), 41 citations
- Risks from Learned Optimization in Advanced Machine Learning Systems (2019), 25 citations
- Meta-trained agents implement Bayes-optimal agents (2020), 24 citations
- Neural networks are a priori biased towards Boolean functions with low entropy (2019), 11 citations
- Algorithms for Causal Reasoning in Probability Trees (2020), 10 citations
- Tracr: Compiled Transformers as a Laboratory for Interpretability (2023), 6 citations
- Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla (2023), 6 citations