No Explainability without Accountability
Abstract
Automatically generated explanations of how machine learning (ML) models reason can help users understand and accept them. However, explanations can have unintended consequences: promoting over-reliance or undermining trust. This paper investigates how explanations shape users' perceptions of ML models, with or without the ability to provide feedback to them: (1) whether revealing model flaws increases users' desire to "fix" them; and (2) whether providing explanations causes users to believe, wrongly, that models are introspective and will thus improve over time. Through two controlled experiments that varied model quality, we show how the combination of explanations and user feedback impacted perceptions such as frustration and expectations of model improvement. Explanations without an opportunity for feedback were frustrating with a lower-quality model, while interactions between explanation and feedback for the higher-quality model suggest that detailed feedback should not be requested without explanation. Users expected model correction regardless of whether they provided feedback or received explanations.