Kshitij Sachan | doi.page

0 works, 0 citations, 0 h-index

Google Scholar OpenAlex

Research Areas

Adversarial Robustness in Machine Learning, Topic Modeling, Neural Networks and Applications, Ethics and Social Impacts of AI, Model Reduction and Neural Networks

Most-Cited Works

→ Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training (2024), 31 cited
→ Debating with More Persuasive LLMs Leads to More Truthful Answers (2024), 9 cited
→ Polysemanticity and Capacity in Neural Networks (2022), 4 cited
→ AI Control: Improving Safety Despite Intentional Subversion (2023), 3 cited