Gabriel Mukobi
Stanford University (US)
Research Areas
Explainable Artificial Intelligence (XAI), Topic Modeling, Ethics and Social Impacts of AI, International Relations and Foreign Policy, Natural Language Processing Techniques
Most-Cited Works
- Escalation Risks from Language Models in Military and Diplomatic Decision-Making (2024), 31 citations
- The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning (2024), 13 citations
- Open Problems in Technical AI Governance (2024), 5 citations
- Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? (2024), 4 citations
- SuperHF: Supervised Iterative Learning from Human Feedback (2023), 2 citations
- Welfare Diplomacy: Benchmarking Language Model Cooperation (2023), 1 citation
- Reasons to Doubt the Impact of AI Risk Evaluations (2024), 1 citation
- Opportunities in Physics Education: Low-Cost Position Tracking for Use in Kinematics Labs (2018)