Novi Quadrianto

University of Cambridge

Safe Fairness Guarantees Without Demographics in Classification: Spectral Uncertainty Set Perspective

Feb 12, 2026

Ainhize Barrainkua, Santiago Mazuelas, Novi Quadrianto, Jose A. Lozano

Abstract:As automated classification systems become increasingly prevalent, concerns have emerged over their potential to reinforce and amplify existing societal biases. In light of this issue, many methods have been proposed to enhance the fairness guarantees of classifiers. Most of the existing interventions assume access to group information for all instances, a requirement rarely met in practice. Fairness without access to demographic information has often been approached through robust optimization techniques, which target worst-case outcomes over a set of plausible distributions known as the uncertainty set. However, their effectiveness is strongly influenced by the chosen uncertainty set. In fact, existing approaches often overemphasize outliers or overly pessimistic scenarios, compromising both overall performance and fairness. To overcome these limitations, we introduce SPECTRE, a minimax-fair method that adjusts the spectrum of a simple Fourier feature mapping and constrains the extent to which the worst-case distribution can deviate from the empirical distribution. We perform extensive experiments on the American Community Survey datasets involving 20 states. SPECTRE is safe in the sense that it provides the highest average values on fairness guarantees together with the smallest interquartile range in comparison to state-of-the-art approaches, even compared to those with access to demographic group information. In addition, we provide a theoretical analysis that derives computable bounds on the worst-case error for both individual groups and the overall population, and characterizes the worst-case distributions responsible for these extremal performances.

Dissecting Performative Prediction: A Comprehensive Survey

Feb 10, 2026

Thomas Kehrenberg, Javier Sanguino, Jose A. Lozano, Novi Quadrianto

Abstract:The field of performative prediction had its beginnings in 2020 with the seminal paper "Performative Prediction" by Perdomo et al., which established a novel machine learning setup where the deployment of a predictive model causes a distribution shift in the environment, which in turn causes a mismatch between the distribution expected by the predictive model and the real distribution. This shift is defined by a so-called distribution map. In the half-decade since, a literature has emerged which has, among other things, introduced new solution concepts to the original setup, extended the setup, offered new theoretical analyses, and examined the intersection of performative prediction and other established fields. In this survey, we first lay out the performative prediction setting and explain the different optimization targets: performative stability and performative optimality. We introduce a new way of classifying different performative prediction settings, based on how much information is available about the distribution map. We survey existing implementations of distribution maps and existing methods to address the problem of performative prediction, while examining different ways to categorize them. Finally, we point out known and previously unknown connections that can be drawn to other fields, in the hopes of stimulating future research.
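
As a rough illustration of the performative stability target mentioned in the abstract above, the following minimal sketch runs repeated risk minimization on a toy problem. Everything in it is an assumption made for illustration and is not taken from the survey: the distribution map is a hypothetical one-dimensional mean shift `mu + eps * theta`, the loss is squared error, and the function names are invented here.

```python
def distribution_mean(theta, mu=1.0, eps=0.5):
    """Hypothetical distribution map: deploying theta moves the data mean
    to mu + eps * theta (an assumed toy model, not from the survey)."""
    return mu + eps * theta


def repeated_risk_minimization(theta0=0.0, steps=50, mu=1.0, eps=0.5):
    """Retrain on the distribution induced by the previously deployed model."""
    theta = theta0
    for _ in range(steps):
        # Under squared loss, the risk minimizer on the induced distribution
        # is simply that distribution's mean.
        theta = distribution_mean(theta, mu, eps)
    return theta


theta_ps = repeated_risk_minimization()
# A performatively stable point is a fixed point of the retraining step:
# the model is optimal for the very distribution it induces.
print(theta_ps, distribution_mean(theta_ps))
```

In this toy setting the iteration contracts whenever `|eps| < 1`, so the deployed parameter settles at `mu / (1 - eps)`; with a stronger shift the same loop would diverge.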

Who Pays for Fairness? Rethinking Recourse under Social Burden

Sep 04, 2025

Ainhize Barrainkua, Giovanni De Toni, Jose Antonio Lozano, Novi Quadrianto

Abstract:Machine learning based predictions are increasingly used in sensitive decision-making applications that directly affect our lives. This has led to extensive research into ensuring the fairness of classifiers. Beyond just fair classification, emerging legislation now mandates that when a classifier delivers a negative decision, it must also offer actionable steps an individual can take to reverse that outcome. This concept is known as algorithmic recourse. Nevertheless, many researchers have expressed concerns about the fairness guarantees within the recourse process itself. In this work, we provide a holistic theoretical characterization of unfairness in algorithmic recourse, formally linking fairness guarantees in recourse and classification, and highlighting limitations of the standard equal cost paradigm. We then introduce a novel fairness framework based on social burden, along with a practical algorithm (MISOB), broadly applicable under real-world conditions. Empirical results on real-world datasets show that MISOB reduces the social burden across all groups without compromising overall classifier accuracy.

The Decoupled Risk Landscape in Performative Prediction

Jun 10, 2025

Javier Sanguino, Thomas Kehrenberg, Jose A. Lozano, Novi Quadrianto

Abstract:Performative Prediction addresses scenarios where deploying a model induces a distribution shift in the input data, such as individuals modifying their features and reapplying for a bank loan after rejection. The literature has largely taken a theoretical perspective, giving mathematical guarantees for convergence (either to the stable or optimal point). We believe that visualization of the loss landscape can complement these theoretical advances with practical insights. Therefore, (1) we introduce a simple decoupled risk visualization method inspired by the two-step process underlying performative prediction. Our approach visualizes the risk landscape with respect to two parameter vectors: model parameters and data parameters. We use this method to propose new properties of the points of interest, and to examine how existing algorithms traverse the risk landscape and perform under more realistic conditions, including strategic classification with non-linear models. (2) Building on this decoupled risk visualization, we introduce a novel setting - extended Performative Prediction - which captures scenarios where the distribution reacts to a model different from the decision-making one, reflecting the reality that agents often lack full access to the deployed model.

Diversity-Driven Learning: Tackling Spurious Correlations and Data Heterogeneity in Federated Models

Apr 15, 2025

Gergely D. Németh, Eros Fanì, Yeat Jeng Ng, Barbara Caputo, Miguel Ángel Lozano, Nuria Oliver, Novi Quadrianto

Abstract:Federated Learning (FL) enables decentralized training of machine learning models on distributed data while preserving privacy. However, in real-world FL settings, client data is often non-identically distributed and imbalanced, resulting in statistical data heterogeneity which impacts the generalization capabilities of the server's model across clients, slows convergence and reduces performance. In this paper, we address this challenge by first proposing a characterization of statistical data heterogeneity by means of 6 metrics of global and client attribute imbalance, class imbalance, and spurious correlations. Next, we create and share 7 computer vision datasets for binary and multiclass image classification tasks in Federated Learning that cover a broad range of statistical data heterogeneity and hence simulate real-world situations. Finally, we propose FedDiverse, a novel client selection algorithm in FL which is designed to manage and leverage data heterogeneity across clients by promoting collaboration between clients with complementary data distributions. Experiments on the seven proposed FL datasets demonstrate FedDiverse's effectiveness in enhancing the performance and robustness of a variety of FL methods while having low communication and computational overhead.

Efficient Online Inference of Vision Transformers by Training-Free Tokenization

Nov 23, 2024

Leonidas Gee, Wing Yan Li, Viktoriia Sharmanska, Novi Quadrianto

Abstract:The cost of deploying vision transformers increasingly represents a barrier to wider industrial adoption. Existing compression requires additional end-to-end fine-tuning or incurs a significant drawback to runtime, thus making them ill-suited for online inference. We introduce the $\textbf{Visual Word Tokenizer}$ (VWT), a training-free method for reducing energy costs while retaining performance and runtime. The VWT groups patches (visual subwords) that are frequently used into visual words while infrequent ones remain intact. To do so, intra-image or inter-image statistics are leveraged to identify similar visual concepts for compression. Experimentally, we demonstrate a reduction in wattage of up to 19% with only a 20% increase in runtime at most. Comparative approaches of 8-bit quantization and token merging achieve a lower or similar energy efficiency but exact a higher toll on runtime (up to $2\times$ or more). Our results indicate that VWTs are well-suited for efficient online inference with a marginal compromise on performance.

Dancing in the Shadows: Harnessing Ambiguity for Fairer Classifiers

Jun 27, 2024

Ainhize Barrainkua, Paula Gordaliza, Jose A. Lozano, Novi Quadrianto

Abstract:This paper introduces a novel approach to bolster algorithmic fairness in scenarios where sensitive information is only partially known. In particular, we propose to leverage instances with uncertain identity with regard to the sensitive attribute to train a conventional machine learning classifier. The enhanced fairness observed in the final predictions of this classifier highlights the promising potential of prioritizing ambiguity (i.e., non-normativity) as a means to improve fairness guarantees in real-world classification tasks.

* Presented at the XI Symposium of Theory and Applications of Data Mining from the XX Conference of the Spanish Association for Artificial Intelligence CAEPIA 2024

Are Compressed Language Models Less Subgroup Robust?

Mar 26, 2024

Leonidas Gee, Andrea Zugarini, Novi Quadrianto

Abstract:To reduce the inference cost of large language models, model compression is increasingly used to create smaller scalable models. However, little is known about their robustness to minority subgroups defined by the labels and attributes of a dataset. In this paper, we investigate the effects of 18 different compression methods and settings on the subgroup robustness of BERT language models. We show that worst-group performance does not depend on model size alone, but also on the compression method used. Additionally, we find that model compression does not always worsen the performance on minority subgroups. Altogether, our analysis serves to further research into the subgroup robustness of model compression.

* Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Main Track
* The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)
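
The worst-group performance discussed in the abstract above can be made concrete with a small sketch. It assumes, as the abstract states, that subgroups are defined by label and attribute pairs; the function name and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def worst_group_accuracy(labels, attributes, predictions):
    """Return the (label, attribute) subgroup with the lowest accuracy
    together with that accuracy."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, a, p in zip(labels, attributes, predictions):
        group = (y, a)  # subgroup defined by label and attribute
        total[group] += 1
        correct[group] += int(p == y)
    per_group = {g: correct[g] / total[g] for g in total}
    worst = min(per_group, key=per_group.get)
    return worst, per_group[worst]


# Toy data, invented for illustration only.
labels      = [1, 1, 1, 0, 0, 0, 1, 0]
attributes  = ["a", "a", "b", "a", "b", "b", "b", "a"]
predictions = [1, 1, 0, 0, 0, 0, 1, 0]

group, accuracy = worst_group_accuracy(labels, attributes, predictions)
print(group, accuracy)
```

Here the overall accuracy is 7/8, while the subgroup with label 1 and attribute "b" is only half correct, which is the kind of gap an average metric hides.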

Addressing Membership Inference Attack in Federated Learning with Model Compression

Nov 29, 2023

Gergely Dániel Németh, Miguel Ángel Lozano, Novi Quadrianto, Nuria Oliver

Abstract:Federated Learning (FL) has been proposed as a privacy-preserving solution for machine learning. However, recent works have shown that Federated Learning can leak private client data through membership attacks. In this paper, we show that the effectiveness of these attacks on the clients negatively correlates with the size of the client datasets and model complexity. Based on this finding, we propose model-agnostic Federated Learning as a privacy-enhancing solution because it enables the use of models of varying complexity in the clients. To this end, we present $\texttt{MaPP-FL}$, a novel privacy-aware FL approach that leverages model compression on the clients while keeping a full model on the server. We compare the performance of $\texttt{MaPP-FL}$ against state-of-the-art model-agnostic FL methods on the CIFAR-10, CIFAR-100, and FEMNIST vision datasets. Our experiments show the effectiveness of $\texttt{MaPP-FL}$ in preserving the clients' and the server's privacy while achieving competitive classification accuracies.

Uncertainty in Fairness Assessment: Maintaining Stable Conclusions Despite Fluctuations

Feb 02, 2023

Ainhize Barrainkua, Paula Gordaliza, Jose A. Lozano, Novi Quadrianto

Abstract:Several recent works encourage the use of a Bayesian framework when assessing performance and fairness metrics of a classification algorithm in a supervised setting. We propose the Uncertainty Matters (UM) framework that generalizes a Beta-Binomial approach to derive the posterior distribution of any criteria combination, allowing stable performance assessment in a bias-aware setting. We suggest modeling the confusion matrix of each demographic group using a Multinomial distribution updated through a Bayesian procedure. We extend UM to be applicable under the popular K-fold cross-validation procedure. Experiments highlight the benefits of UM over classical evaluation frameworks regarding informativeness and stability.

* 25 pages (including references and appendix), 10 figures. Submitted to ICML 2023
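
To give a flavor of the Bayesian treatment of per-group confusion matrices described in the abstract above, here is a minimal sketch. It is not the UM framework itself: it assumes a uniform Dirichlet prior over the four confusion-matrix cells of each group, draws from the resulting Dirichlet posterior using only the standard library, and summarizes the posterior of a true-positive-rate gap. The counts, the prior, and the function names are all hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def posterior_tpr_gap(counts_a, counts_b, n_samples=5000, seed=0):
    """counts_* are (TP, FN, FP, TN) for each group.
    Returns posterior samples of TPR_a - TPR_b."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_samples):
        # Multinomial likelihood with an assumed uniform Dirichlet(1, 1, 1, 1)
        # prior gives a Dirichlet(1 + counts) posterior over cell probabilities.
        tp_a, fn_a, _, _ = sample_dirichlet([1 + c for c in counts_a], rng)
        tp_b, fn_b, _, _ = sample_dirichlet([1 + c for c in counts_b], rng)
        gaps.append(tp_a / (tp_a + fn_a) - tp_b / (tp_b + fn_b))
    return gaps


# Hypothetical confusion-matrix counts for two demographic groups.
gaps = sorted(posterior_tpr_gap((40, 10, 5, 45), (25, 25, 5, 45)))
mean_gap = sum(gaps) / len(gaps)
interval = (gaps[int(0.025 * len(gaps))], gaps[int(0.975 * len(gaps))])
print(mean_gap, interval)
```

Reporting a credible interval for the gap, rather than a single point estimate, is what allows a conclusion about fairness to stay stable when the test split changes.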
