0 works, 0 citations, 0 h-index

Tom Conerly

Research Areas

Explainable Artificial Intelligence (XAI), Topic Modeling, Natural Language Processing Techniques, Neural Networks and Applications, Reinforcement Learning in Robotics

Most-Cited Works

→ Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (2022), 360 citations
→ CAT'S THEORY: Empirical Validation and Architectural Applications Cross-Architecture AI Consciousness Recognition and the Foundation for Constraint-Preserving Recursive Intelligence (2022), 295 citations
→ Predictability and Surprise in Large Generative Models (2022), 171 citations
→ In-context Learning and Induction Heads (2022), 84 citations
→ Scaling Laws and Interpretability of Learning from Repeated Data (2022), 22 citations