Chris Olah | doi.page

0 works · 0 citations · 0 h-index

Chris Olah

Research Areas

Topic Modeling, Natural Language Processing Techniques, Adversarial Robustness in Machine Learning, Neural Networks and Applications, Explainable Artificial Intelligence (XAI)

Most-Cited Works

→ TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems (2016), 9,726 citations
→ Deconvolution and Checkerboard Artifacts (2016), 1,638 citations
→ Feature Visualization (2017), 803 citations
→ The Building Blocks of Interpretability (2018), 603 citations
→ Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (2022), 360 citations
→ Zoom In: An Introduction to Circuits (2020), 242 citations
→ Multimodal Neurons in Artificial Neural Networks (2021), 213 citations
→ Language Models (Mostly) Know What They Know (2022), 159 citations
→ Activation Atlas (2019), 135 citations
→ Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned (2022), 99 citations