Akbir Khan
Research Areas
Topic Modeling, Adversarial Robustness in Machine Learning, Natural Language Processing Techniques, Reinforcement Learning in Robotics, Experimental Behavioral Economics Studies
Most-Cited Works
- The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs (2022), 21 citations
- Alignment faking in large language models (2024), 17 citations
- Debating with More Persuasive LLMs Leads to More Truthful Answers (2024), 9 citations
- Considering Race a Problem of Transfer Learning (2019), 9 citations
- Multi-Agent Risks from Advanced AI (2025), 7 citations
- MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning (2023), 3 citations
- Language Models Learn to Mislead Humans via RLHF (2024), 2 citations
- JaxMARL: Multi-Agent RL Environments and Algorithms in JAX (2023), 2 citations
- Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats (2024), 1 citation
- BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games (2024), 1 citation