Jerzy Stefanowski

PREF-XAI: Preference-Based Personalized Rule Explanations of Black-Box Machine Learning Models

Apr 21, 2026

Salvatore Greco, Jacek Karolczak, Roman Słowiński, Jerzy Stefanowski

Abstract: Explainable artificial intelligence (XAI) has predominantly focused on generating model-centric explanations that approximate the behavior of black-box models. However, such explanations often overlook a fundamental aspect of interpretability: different users require different explanations depending on their goals, preferences, and cognitive constraints. Although recent work has explored user-centric and personalized explanations, most existing approaches rely on heuristic adaptations or implicit user modeling, lacking a principled framework for representing and learning individual preferences. In this paper, we consider Preference-Based Explainable Artificial Intelligence (PREF-XAI), a novel perspective that reframes explanation as a preference-driven decision problem. Within PREF-XAI, explanations are not treated as fixed outputs, but as alternatives to be evaluated and selected according to user-specific criteria. Within this perspective, we propose a methodology that combines rule-based explanations with formal preference learning. User preferences are elicited through a ranking of a small set of candidate explanations and modeled via an additive utility function inferred using robust ordinal regression. Experimental results on real-world datasets show that PREF-XAI can accurately reconstruct user preferences from limited feedback, identify highly relevant explanations, and discover novel explanatory rules not initially considered by the user. Beyond the proposed methodology, this work establishes a connection between XAI and preference learning, opening new directions for interactive and adaptive explanation systems.
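
The idea of scoring candidate rules with an additive utility and checking it against a user's ranking can be illustrated with a minimal sketch. It does not perform robust ordinal regression; it only tests whether one given set of weights is compatible with a ranking. The criteria names (`confidence`, `coverage`, `simplicity`), the linear marginal utilities, and all numeric values are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical rule description: criterion name -> value normalised to [0, 1],
# where larger is better. These criteria are illustrative assumptions.
Rule = Dict[str, float]


def utility(rule: Rule, weights: Dict[str, float]) -> float:
    """Additive utility with linear marginal utilities (an assumed simplification)."""
    return sum(weights[c] * rule[c] for c in weights)


def is_compatible(ranking: List[Rule], weights: Dict[str, float]) -> bool:
    """True if utility strictly decreases along the user's ranking (best first)."""
    scores = [utility(r, weights) for r in ranking]
    return all(a > b for a, b in zip(scores, scores[1:]))


def rank(candidates: List[Rule], weights: Dict[str, float]) -> List[Rule]:
    """Order candidate rules from highest to lowest utility."""
    return sorted(candidates, key=lambda r: utility(r, weights), reverse=True)


r1 = {"confidence": 0.9, "coverage": 0.5, "simplicity": 0.8}
r2 = {"confidence": 0.8, "coverage": 0.4, "simplicity": 0.6}
r3 = {"confidence": 0.6, "coverage": 0.9, "simplicity": 0.2}

user_ranking = [r1, r2, r3]  # the user's stated order, best first
w_a = {"confidence": 0.5, "coverage": 0.3, "simplicity": 0.2}
w_b = {"confidence": 0.1, "coverage": 0.8, "simplicity": 0.1}

print(is_compatible(user_ranking, w_a))  # weights that reproduce the ranking
print(is_compatible(user_ranking, w_b))  # weights that contradict it
```

A preference-learning procedure would search for the whole set of utility functions compatible with the ranking rather than test a single weight vector as done here.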

A Probabilistic Consensus-Driven Approach for Robust Counterfactual Explanations

Apr 19, 2026

Marcin Kostrzewa, Maciej Zięba, Jerzy Stefanowski

Abstract: Counterfactual explanations (CFEs) are essential for interpreting black-box models, yet they often become invalid when models are slightly changed. Existing methods for generating robust CFEs are often limited to specific types of models, require costly tuning, or offer only inflexible robustness controls. We propose a novel approach that jointly models the data distribution and the space of plausible model decisions to ensure robustness to model changes. Using a probabilistic consensus over a model ensemble, we train a conditional normalizing flow that captures the data density under varying levels of classifier agreement. At inference time, a single interpretable parameter controls the robustness level; it specifies the minimum fraction of models that should agree on the target class without retraining the generative model. Our method effectively pushes CFEs toward regions that are both plausible and stable across model changes. Experimental results demonstrate that our approach achieves superior empirical robustness while also maintaining good performance across other evaluation measures.
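
The robustness parameter described above, a minimum fraction of ensemble members that must agree on the target class, can be sketched as a simple check. This is only the agreement test; the conditional normalizing flow is not reproduced. The names `agreement`, `is_robust`, and `tau`, and the toy threshold classifiers, are assumptions for illustration.

```python
from typing import Callable, List, Sequence

Classifier = Callable[[Sequence[float]], int]


def agreement(models: List[Classifier], x: Sequence[float], target: int) -> float:
    """Fraction of ensemble members that predict the target class for x."""
    return sum(m(x) == target for m in models) / len(models)


def is_robust(models: List[Classifier], x: Sequence[float],
              target: int, tau: float) -> bool:
    """True if at least a fraction tau of the models agree on the target class."""
    return agreement(models, x, target) >= tau


# Toy ensemble: each member thresholds the first feature at a different value,
# standing in for slightly different retrained versions of one model.
ensemble = [lambda x, t=t: int(x[0] > t) for t in (0.4, 0.5, 0.6, 0.7)]

candidate = [0.65]  # a hypothetical counterfactual point
print(agreement(ensemble, candidate, target=1))
print(is_robust(ensemble, candidate, target=1, tau=0.75))
print(is_robust(ensemble, candidate, target=1, tau=0.9))
```

Raising `tau` demands a counterfactual that lies deeper inside the region where the ensemble members agree.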

Towards Differentiating Between Failures and Domain Shifts in Industrial Data Streams

Mar 09, 2026

Natalia Wojak-Strzelecka, Szymon Bobek, Grzegorz J. Nalepa, Jerzy Stefanowski

Abstract: Anomaly and failure detection methods are crucial in identifying deviations from normal system operational conditions, which allows for actions to be taken in advance, usually preventing more serious damage. Long-lasting deviations indicate failures, while sudden, isolated changes in the data indicate anomalies. However, in many practical applications, changes in the data do not always represent abnormal system states. Such changes may be incorrectly recognized as failures when they are in fact a normal evolution of the system, for example when processing of a new product starts, which constitutes a domain shift. Therefore, distinguishing between failures and such "healthy" changes in data distribution is critical to ensure the practical robustness of the system. In this paper, we propose a method that not only detects changes in the data distribution and anomalies but also allows us to distinguish between failures and normal domain shifts inherent to a given process. The proposed method consists of a modified Page-Hinkley changepoint detector for identification of the domain shift and possible failures and supervised domain-adaptation-based algorithms for fast, online anomaly detection. These two are coupled with an explainable artificial intelligence (XAI) component that helps the human operator make the final distinction between domain shifts and failures. The method is illustrated by an experiment on a data stream from a steel factory.
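
For readers unfamiliar with the Page-Hinkley test mentioned above, a minimal sketch of the textbook one-sided version (detecting an upward shift in the mean) follows. It is not the modified detector proposed in the paper; the parameter values `delta` and `threshold` and the helper `first_alarm` are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PageHinkley:
    """Textbook one-sided Page-Hinkley test for an upward shift in the mean."""
    delta: float = 0.005    # tolerated magnitude of change (assumed value)
    threshold: float = 5.0  # alarm threshold, often written lambda (assumed value)
    n: int = 0              # number of observations seen so far
    mean: float = 0.0       # running mean of the observations
    cum: float = 0.0        # cumulative deviation m_t
    cum_min: float = 0.0    # running minimum of m_t

    def update(self, x: float) -> bool:
        """Consume one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def first_alarm(stream: Iterable[float], **kwargs) -> Optional[int]:
    """Index of the first observation that triggers an alarm, or None."""
    detector = PageHinkley(**kwargs)
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


# Synthetic stream: the mean jumps from 0 to 5 at index 100.
stream = [0.0] * 100 + [5.0] * 100
print(first_alarm(stream))
```

A detector of this kind only reports that the distribution changed; deciding whether the change is a failure or a benign domain shift is the separate problem the abstract addresses.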

An interpretable prototype parts-based neural network for medical tabular data

Mar 05, 2026

Jacek Karolczak, Jerzy Stefanowski

Abstract: The ability to interpret machine learning model decisions is critical in such domains as healthcare, where trust in model predictions is as important as their accuracy. Inspired by the development of prototype parts-based deep neural networks in computer vision, we propose a new model for tabular data, specifically tailored to medical records, that requires discretization of diagnostic result norms. Unlike the original vision models that rely on spatial structure, our method employs trainable patching over features describing a patient to learn meaningful prototypical parts from structured data. These parts are represented as binary or discretized feature subsets. This allows the model to express prototypes in human-readable terms, enabling alignment with clinical language and case-based reasoning. Our proposed neural network is inherently interpretable and offers interpretable concept-based predictions by comparing the patient's description to learned prototypes in the latent space of the network. In experiments, we demonstrate that the model achieves classification performance competitive with widely used baseline models on medical benchmark datasets, while also offering transparency, bridging the gap between predictive performance and interpretability in clinical decision support.

* Proc. of EXPLIMED at ECAI 2025

Explaining Concept Drift through the Evolution of Group Counterfactuals

Sep 11, 2025

Ignacy Stępka, Jerzy Stefanowski

Abstract: Machine learning models in dynamic environments often suffer from concept drift, where changes in the data distribution degrade performance. While detecting this drift is a well-studied topic, explaining how and why the model's decision-making logic changes remains a significant challenge. In this paper, we introduce a novel methodology to explain concept drift by analyzing the temporal evolution of group-based counterfactual explanations (GCEs). Our approach tracks shifts in the GCEs' cluster centroids and their associated counterfactual action vectors before and after a drift. These evolving GCEs act as an interpretable proxy, revealing structural changes in the model's decision boundary and its underlying rationale. We operationalize this analysis within a three-layer framework that synergistically combines insights from the data layer (distributional shifts), the model layer (prediction disagreement), and our proposed explanation layer. We show that such a holistic view allows for a more comprehensive diagnosis of drift, making it possible to distinguish between different root causes, such as a spatial data shift versus a re-labeling of concepts.

* TempXAI Workshop @ ECML PKDD 2025

This part looks alike this: identifying important parts of explained instances and prototypes

May 08, 2025

Jacek Karolczak, Jerzy Stefanowski

Abstract: Although prototype-based explanations provide a human-understandable way of representing model predictions, they often fail to direct user attention to the most relevant features. We propose a novel approach to identify the most informative features within prototypes, termed alike parts. Using feature importance scores derived from an agnostic explanation method, the approach emphasizes the most relevant overlapping features between an instance and its nearest prototype. Furthermore, the feature importance score is incorporated into the objective function of the prototype selection algorithms to promote global prototype diversity. Through experiments on six benchmark datasets, we demonstrate that the proposed approach improves user comprehension while maintaining or even increasing predictive accuracy.

DetoxAI: a Python Toolkit for Debiasing Deep Learning Models in Computer Vision

May 02, 2025

Ignacy Stępka, Lukasz Sztukiewicz, Michał Wiliński, Jerzy Stefanowski

Abstract: While machine learning fairness has made significant progress in recent years, most existing solutions focus on tabular data and are poorly suited for vision-based classification tasks, which rely heavily on deep learning. To bridge this gap, we introduce DetoxAI, an open-source Python library for improving fairness in deep learning vision classifiers through post-hoc debiasing. DetoxAI implements state-of-the-art debiasing algorithms, fairness metrics, and visualization tools. It supports debiasing via interventions in internal representations and includes attribution-based visualization tools and quantitative algorithmic fairness metrics to show how bias is mitigated. This paper presents the motivation, design, and use cases of DetoxAI, demonstrating its tangible value to engineers and researchers.

Properties of fairness measures in the context of varying class imbalance and protected group ratios

Nov 13, 2024

Dariusz Brzezinski, Julia Stachowiak, Jerzy Stefanowski, Izabela Szczech, Robert Susmaga, Sofya Aksenyuk, Uladzimir Ivashka, Oleksandr Yasinskyi

Abstract: Society is increasingly relying on predictive models in fields like criminal justice, credit risk management, or hiring. To prevent such automated systems from discriminating against people belonging to certain groups, fairness measures have become a crucial component in socially relevant applications of machine learning. However, existing fairness measures have been designed to assess the bias between predictions for protected groups without considering the imbalance in the classes of the target variable. Current research on the potential effect of class imbalance on fairness focuses on practical applications rather than dataset-independent measure properties. In this paper, we study the general properties of fairness measures for changing class and protected group proportions. For this purpose, we analyze the probability mass functions of six of the most popular group fairness measures. We also measure how the probability of achieving perfect fairness changes for varying class imbalance ratios. Moreover, we relate the dataset-independent properties of fairness measures described in this paper to classifier fairness in real-life tasks. Our results show that measures such as Equal Opportunity and Positive Predictive Parity are more sensitive to changes in class imbalance than Accuracy Equality. These findings can help guide researchers and practitioners in choosing the most appropriate fairness measures for their classification problems.
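
The three measures named in the abstract can be computed from per-group confusion matrices, as in the sketch below. It assumes the common formulation in which each measure is the difference of a rate between the protected and unprotected group (true positive rate for Equal Opportunity, positive predictive value for Positive Predictive Parity, accuracy for Accuracy Equality); the paper may define them differently, and the counts are made up for illustration.

```python
from typing import Dict, NamedTuple


class Confusion(NamedTuple):
    """Confusion-matrix counts for one group."""
    tp: int
    fp: int
    tn: int
    fn: int


def safe_div(num: float, den: float) -> float:
    """Return NaN when the rate is undefined (zero denominator)."""
    return num / den if den else float("nan")


def tpr(c: Confusion) -> float:
    return safe_div(c.tp, c.tp + c.fn)


def ppv(c: Confusion) -> float:
    return safe_div(c.tp, c.tp + c.fp)


def accuracy(c: Confusion) -> float:
    return safe_div(c.tp + c.tn, sum(c))


def fairness_gaps(protected: Confusion, unprotected: Confusion) -> Dict[str, float]:
    """Rate differences between groups; 0 means perfect fairness for that measure."""
    return {
        "equal_opportunity": tpr(protected) - tpr(unprotected),
        "positive_predictive_parity": ppv(protected) - ppv(unprotected),
        "accuracy_equality": accuracy(protected) - accuracy(unprotected),
    }


# Made-up counts for two groups of 100 people each.
group_a = Confusion(tp=40, fp=10, tn=40, fn=10)
group_b = Confusion(tp=30, fp=10, tn=40, fn=20)
gaps = fairness_gaps(group_a, group_b)
print(gaps)
```

The rates behind Equal Opportunity and Positive Predictive Parity have denominators that depend on the number of actual or predicted positives, which shrink as the positive class becomes rarer, whereas accuracy is computed over the whole group.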

Improving Online Bagging for Complex Imbalanced Data Stream

Oct 04, 2024

Bartosz Przybyl, Jerzy Stefanowski

Abstract: Learning classifiers from imbalanced and concept drifting data streams is still a challenge. Most current proposals account only for changes in the global imbalance ratio and ignore local difficulty factors, such as the decomposition of the minority class into sub-concepts and the presence of unsafe types of examples (borderline or rare ones). Because these factors may degrade the performance of popular online classifiers, we propose extensions of resampling online bagging, namely Neighbourhood Undersampling or Oversampling Online Bagging, to take better account of the presence of unsafe minority examples. Computational experiments with synthetic complex imbalanced data streams show their advantage over earlier variants of online bagging resampling ensembles.
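
As background for the resampling online bagging family, here is a minimal sketch of online bagging in which each base learner sees every example `k ~ Poisson(lambda)` times, with a larger `lambda` for the minority class. This is the plain oversampling baseline only: the neighbourhood-based adjustment proposed in the paper is not implemented, and the fixed `minority_rate`, the nearest-centroid base learner, and all class names are assumptions for illustration.

```python
import math
import random
from collections import Counter


def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's algorithm (fine for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


class CentroidLearner:
    """Toy incremental base learner: predicts the class with the nearest centroid."""

    def __init__(self):
        self.sums = {}
        self.counts = Counter()

    def learn(self, x, y, weight):
        if weight == 0:
            return
        acc = self.sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += weight * v
        self.counts[y] += weight

    def predict(self, x):
        best, best_d = None, math.inf
        for y, acc in self.sums.items():
            d = sum((v - s / self.counts[y]) ** 2 for v, s in zip(x, acc))
            if d < best_d:
                best, best_d = y, d
        return best


class OversamplingOnlineBagging:
    """Online bagging with a higher Poisson rate for minority-class examples."""

    def __init__(self, n_learners=10, minority_label=1, minority_rate=10.0, seed=0):
        self.learners = [CentroidLearner() for _ in range(n_learners)]
        self.minority_label = minority_label
        self.minority_rate = minority_rate  # assumed fixed; real variants adapt it
        self.rng = random.Random(seed)

    def learn_one(self, x, y):
        lam = self.minority_rate if y == self.minority_label else 1.0
        for learner in self.learners:
            learner.learn(x, y, poisson(lam, self.rng))

    def predict_one(self, x):
        preds = (learner.predict(x) for learner in self.learners)
        votes = Counter(p for p in preds if p is not None)
        return votes.most_common(1)[0][0] if votes else None


# Imbalanced toy stream: 200 majority examples near 0, 20 minority examples near 10.
data_rng = random.Random(1)
stream = [([data_rng.gauss(0, 1)], 0) for _ in range(200)]
stream += [([data_rng.gauss(10, 1)], 1) for _ in range(20)]
data_rng.shuffle(stream)

model = OversamplingOnlineBagging()
for x, y in stream:
    model.learn_one(x, y)
print(model.predict_one([9.5]), model.predict_one([0.2]))
```

A neighbourhood-aware variant would replace the fixed `minority_rate` with a rate that depends on how many majority examples surround each incoming minority example.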

* 16 pages, 4 figures

Counterfactual Explanations with Probabilistic Guarantees on their Robustness to Model Change

Aug 09, 2024

Ignacy Stępka, Mateusz Lango, Jerzy Stefanowski

Abstract: Counterfactual explanations (CFEs) guide users on how to adjust inputs to machine learning models to achieve desired outputs. While existing research primarily addresses static scenarios, real-world applications often involve data or model changes, potentially invalidating previously generated CFEs and rendering user-induced input changes ineffective. Current methods addressing this issue often support only specific models or change types, require extensive hyperparameter tuning, or fail to provide probabilistic guarantees on CFE robustness to model changes. This paper proposes a novel approach for generating CFEs that provides probabilistic guarantees for any model and change type, while offering interpretable and easy-to-select hyperparameters. We establish a theoretical framework for probabilistically defining robustness to model change and demonstrate how our BetaRCE method directly stems from it. BetaRCE is a post-hoc method applied alongside a chosen base CFE generation method to enhance the quality of the explanation beyond robustness. It facilitates a transition from the base explanation to a more robust one with user-adjusted probability bounds. Through experimental comparisons with baselines, we show that BetaRCE yields robust, most plausible, and closest to baseline counterfactual explanations.
