Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Felix Liedeker

Do Metrics for Counterfactual Explanations Align with User Perception?

Mar 16, 2026

Felix Liedeker, Basil Ell, Philipp Cimiano, Christoph Düsing

Abstract:Explainability is widely regarded as essential for trustworthy artificial intelligence systems. However, the metrics commonly used to evaluate counterfactual explanations are algorithmic evaluation metrics that are rarely validated against human judgments of explanation quality. This raises the question of whether such metrics meaningfully reflect user perceptions. We address this question through an empirical study that directly compares algorithmic evaluation metrics with human judgments across three datasets. Participants rated counterfactual explanations along multiple dimensions of perceived quality, which we relate to a comprehensive set of standard counterfactual metrics. We analyze both individual relationships and the extent to which combinations of metrics can predict human assessments. Our results show that correlations between algorithmic metrics and human ratings are generally weak and strongly dataset-dependent. Moreover, increasing the number of metrics used in predictive models does not lead to reliable improvements, indicating structural limitations in how current metrics capture criteria relevant for humans. Overall, our findings suggest that widely used counterfactual evaluation metrics fail to reflect key aspects of explanation quality as perceived by users, underscoring the need for more human-centered approaches to evaluating explainable artificial intelligence.

* Accepted at the 4th World Conference on eXplainable Artificial Intelligence (XAI 2026)

Via

Access Paper or Ask Questions

A User Study Evaluating Argumentative Explanations in Diagnostic Decision Support

May 15, 2025

Felix Liedeker, Olivia Sanchez-Graillet, Moana Seidler, Christian Brandt, Jörg Wellmer, Philipp Cimiano

Figure 1 for A User Study Evaluating Argumentative Explanations in Diagnostic Decision Support

Figure 2 for A User Study Evaluating Argumentative Explanations in Diagnostic Decision Support

Figure 3 for A User Study Evaluating Argumentative Explanations in Diagnostic Decision Support

Figure 4 for A User Study Evaluating Argumentative Explanations in Diagnostic Decision Support

Abstract:As the field of healthcare increasingly adopts artificial intelligence, it becomes important to understand which types of explanations increase transparency and empower users to develop confidence and trust in the predictions made by machine learning (ML) systems. In shared decision-making scenarios where doctors cooperate with ML systems to reach an appropriate decision, establishing mutual trust is crucial. In this paper, we explore different approaches to generating explanations in eXplainable AI (XAI) and make their underlying arguments explicit so that they can be evaluated by medical experts. In particular, we present the findings of a user study conducted with physicians to investigate their perceptions of various types of AI-generated explanations in the context of diagnostic decision support. The study aims to identify the most effective and useful explanations that enhance the diagnostic process. In the study, medical doctors filled out a survey to assess different types of explanations. Further, an interview was carried out post-survey to gain qualitative insights on the requirements of explanations incorporated in diagnostic decision support. Overall, the insights gained from this study contribute to understanding the types of explanations that are most effective.

* Presented at 'The First Workshop on Natural Language Argument-Based Explanations', co-located with ECAI 2024

Via

Access Paper or Ask Questions