Abstract: Explainability and fairness have mainly been considered separately, with recent exceptions trying to explain the sources of unfairness. This paper shows that the Shapley value can be used both to define and to explain unfairness under standard group fairness criteria. This offers an integrated framework for estimating, and deriving inference on, unfairness as well as the features that contribute to it. Our framework also extends from Shapley values to the family of Efficient-Symmetric-Linear (ESL) values, some of which offer more robust definitions of fairness and shorter computation times. We illustrate the approach on the Census Income dataset from the UCI Machine Learning Repository: ``Age", ``Number of hours" and ``Marital status" are found to generate gender unfairness, with shorter computation times than traditional bootstrap tests.
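
The core idea of this abstract, attributing a group-fairness gap to individual features via the Shapley value, can be sketched as follows. This is a minimal illustration under assumptions of my own, not the paper's estimator: the synthetic data, the fixed additive scorer, and the demographic-parity gap used as the coalition value function are all hypothetical choices made for the example.

```python
from itertools import combinations
from math import factorial

import numpy as np

def shapley_values(players, v):
    """Exact Shapley values of the TU game (players, v)."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[p] += w * (v(set(S) | {p}) - v(set(S)))
    return phi

# Synthetic data (hypothetical): three features and a binary group label.
rng = np.random.default_rng(0)
m = 500
group = rng.integers(0, 2, m)
X = {"age": rng.normal(40 + 3 * group, 10, m),
     "hours": rng.normal(38 + 2 * group, 5, m),
     "marital": rng.integers(0, 2, m).astype(float)}

def parity_gap(S):
    """Coalition value: demographic-parity gap of a fixed additive
    scorer that is allowed to use only the features in S."""
    score = sum(X[f] for f in S) if S else np.zeros(m)
    return abs(score[group == 0].mean() - score[group == 1].mean())

phi = shapley_values(list(X), parity_gap)
```

By efficiency, the attributions sum exactly to the full-coalition gap, so each `phi[f]` reads as feature `f`'s share of the measured unfairness.
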
Abstract: We introduce Aumann-SHAP, an interaction-aware framework that decomposes counterfactual transitions by restricting the model to a local hypercube connecting the baseline and counterfactual features. Each hypercube is partitioned into a grid in order to construct an induced micro-player cooperative game in which elementary grid-step moves become the players. Shapley and LES values on this TU micro-game yield: (i) the within-pot contribution of each feature to its interactions with the other features (interaction explainability), and (ii) the contribution of each instance and each feature to the counterfactual analysis (individual and global explainability). In particular, Aumann-LES values produce individual and global explanations along the counterfactual transition. As the grid is refined, the Shapley and LES values converge to the diagonal Aumann-Shapley (integrated-gradients) attribution method. Experiments on the German Credit dataset and MNIST show that Aumann-LES produces robust results and better explanations than the standard Shapley value along the counterfactual transition.
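
The Aumann-Shapley limit mentioned in this abstract is the integrated-gradients attribution computed along the straight path from the baseline to the counterfactual. A minimal sketch of that limiting object, with a hypothetical quadratic model standing in for the classifier:

```python
import numpy as np

def integrated_gradients(grad_f, x_base, x_cf, steps=100):
    """Midpoint Riemann approximation of the diagonal Aumann-Shapley
    (integrated-gradients) attribution along the straight path."""
    alphas = (np.arange(steps) + 0.5) / steps
    delta = x_cf - x_base
    grads = np.array([grad_f(x_base + a * delta) for a in alphas])
    return delta * grads.mean(axis=0)

# Hypothetical model f(x) = sum(x_i^2), with gradient 2x.
f = lambda x: float(np.sum(x ** 2))
grad_f = lambda x: 2.0 * x

x_base = np.array([0.0, 1.0, -1.0])   # baseline instance
x_cf = np.array([2.0, 0.0, 1.0])      # counterfactual instance
attr = integrated_gradients(grad_f, x_base, x_cf)
```

Completeness holds: the per-feature attributions sum to f(x_cf) - f(x_base), mirroring the efficiency property of the Shapley and LES values that converge to it.
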
Abstract: This paper introduces innovative enhancements to the K-means and K-nearest neighbors (KNN) algorithms based on the concept of Gini prametric spaces. Unlike traditional distance metrics, Gini-based measures incorporate both value-based and rank-based information, improving robustness to noise and outliers. The main contributions of this work include: proposing a Gini-based measure that captures both rank information and value distances; presenting a Gini K-means algorithm that is proven to converge and demonstrates resilience to noisy data; and introducing a Gini KNN method that performs competitively with state-of-the-art approaches such as Hassanat's distance in noisy environments. Experimental evaluations on 14 datasets from the UCI repository demonstrate the superior performance and efficiency of Gini-based algorithms in clustering and classification tasks. This work opens new avenues for leveraging rank-based measures in machine learning and statistical analysis.
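
To make the idea of mixing value-based and rank-based information concrete, here is a toy dissimilarity that blends the two and a KNN vote built on it. It is an illustrative stand-in, not the Gini prametric defined in the paper; the blending weight `alpha` and the toy data are assumptions for the example.

```python
import numpy as np

def rank_value_distance(X, i, j, alpha=0.5):
    """Blend of a value-based gap and a rank-based gap between rows i, j
    (illustrative only; the paper defines its own Gini-based measure)."""
    ranks = X.argsort(axis=0).argsort(axis=0)        # per-feature ranks
    value_gap = np.abs(X[i] - X[j]).sum()
    rank_gap = np.abs(ranks[i] - ranks[j]).sum() / len(X)
    return alpha * value_gap + (1 - alpha) * rank_gap

def knn_predict(X, y, i, k=3):
    """Majority vote over the k nearest rows to row i (row i excluded)."""
    d = np.array([rank_value_distance(X, i, j) for j in range(len(X))])
    d[i] = np.inf
    nearest = np.argsort(d)[:k]
    return np.bincount(y[nearest]).argmax()
```

A prametric only requires d(x, x) = 0, so the paper's measure need not be a full metric; this toy blend happens to be symmetric as well.
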