Nathan Stromberg

Correcting Class Imbalance in Prior-Data Fitted Networks for Tabular Classification

May 20, 2026

Samuel McDowell, Nathan Stromberg, Lalitha Sankar

Abstract: Prior-data fitted networks (PFNs) have achieved exceptional performance on tabular classification tasks. However, like other classifiers, they can suffer under class imbalance, resulting in poor performance on rare classes. Several techniques attempt to mitigate the effect of class imbalance on classification performance, but the in-context learning (ICL) dynamic of PFNs means that loss-based strategies are impossible, and other techniques are unproven. We adapt several classical techniques for addressing class imbalance and analyze their performance on PFN classification. We observe that thresholding performs exceptionally well because of the calibration characteristics of PFNs, and that downsampling performs comparably because of PFNs' exceptional limited-data performance, with the additional benefit of reduced computation cost for inference.
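The two techniques highlighted in this abstract, downsampling and thresholding, can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the probabilities are made-up stand-ins for a classifier's output, and setting the threshold to the rare-class prior is an assumption (the abstract does not say how the threshold is chosen).

```python
import random
from collections import Counter


def downsample_to_balance(features, labels, seed=0):
    """Randomly keep as many rows per class as the rarest class has."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    n_min = min(len(rows) for rows in by_class.values())
    balanced = []
    for y, rows in by_class.items():
        for x in rng.sample(rows, n_min):
            balanced.append((x, y))
    rng.shuffle(balanced)
    return [x for x, _ in balanced], [y for _, y in balanced]


def predict_with_threshold(rare_class_probs, threshold):
    """Predict the rare class (1) whenever its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in rare_class_probs]


# Toy imbalanced data: 90 majority rows (label 0), 10 rare rows (label 1).
labels = [0] * 90 + [1] * 10
features = [[float(i)] for i in range(100)]

balanced_x, balanced_y = downsample_to_balance(features, labels)
class_counts = Counter(balanced_y)

# Assumed threshold choice: the empirical rare-class prior (0.1 here).
rare_prior = sum(labels) / len(labels)
# Made-up rare-class probabilities standing in for a calibrated classifier.
preds = predict_with_threshold([0.05, 0.2, 0.6], rare_prior)
```

In this sketch the downsampled context keeps 10 rows per class, so a model that conditions on it sees far fewer rows, which is consistent with the reduced inference cost the abstract mentions.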

* 5 pages, 6 figures, Information Theory Workshop (ITW)

Lower Bounds on the MMSE of Adversarially Inferring Sensitive Features

May 13, 2025

Monica Welfert, Nathan Stromberg, Mario Diaz, Lalitha Sankar

Abstract: We propose an adversarial evaluation framework for sensitive feature inference based on minimum mean-squared error (MMSE) estimation with a finite sample size and linear predictive models. Our approach establishes theoretical lower bounds on the true MMSE of inferring sensitive features from noisy observations of other correlated features. These bounds are expressed in terms of the empirical MMSE under a restricted hypothesis class and a non-negative error term. The error term captures both the estimation error due to a finite number of samples and the approximation error from using a restricted hypothesis class. For linear predictive models, we derive closed-form bounds, which are order optimal in terms of the noise variance, on the approximation error for several classes of relationships between the sensitive and non-sensitive features, including linear mappings, binary symmetric channels, and class-conditional multivariate Gaussian distributions. We also present a new lower bound that relies on the MSE, computed on a hold-out validation dataset, of the MMSE estimator learned from finite samples over a restricted hypothesis class. Through empirical evaluation, we demonstrate that our framework serves as an effective tool for MMSE-based adversarial evaluation of sensitive feature inference that balances theoretical guarantees with practical efficiency.
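To make the "hold-out MSE of a linear estimator" quantity concrete, here is a rough sketch under strong assumptions that are not taken from the paper: a scalar sensitive feature S ~ N(0, 1), a single observed feature X = S + noise with Gaussian noise, and ordinary least squares as the restricted hypothesis class. In this jointly Gaussian toy setting the true MMSE is noise_var / (1 + noise_var). The paper's error term and bounds are not reproduced; only the empirical quantity is computed.

```python
import random


def fit_linear(xs, ss):
    """Least-squares fit of s ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_s = sum(ss) / n
    cov = sum((x - mean_x) * (s - mean_s) for x, s in zip(xs, ss)) / n
    var = sum((x - mean_x) ** 2 for x in xs) / n
    slope = cov / var
    return slope, mean_s - slope * mean_x


def mean_squared_error(slope, intercept, xs, ss):
    """Average squared error of the linear predictor on (xs, ss)."""
    return sum((s - (slope * x + intercept)) ** 2 for x, s in zip(xs, ss)) / len(xs)


rng = random.Random(0)
noise_std = 1.0  # assumed noise level for the toy example
sensitive = [rng.gauss(0.0, 1.0) for _ in range(8000)]
observed = [s + rng.gauss(0.0, noise_std) for s in sensitive]

# Fit on the first half, evaluate on the held-out second half.
slope, intercept = fit_linear(observed[:4000], sensitive[:4000])
holdout_mse = mean_squared_error(slope, intercept, observed[4000:], sensitive[4000:])

# Closed-form MMSE for this Gaussian toy model, for comparison only.
true_mmse = noise_std ** 2 / (1.0 + noise_std ** 2)
```

With these settings the hold-out MSE should land near the closed-form value of 0.5, since a linear estimator is optimal when S and X are jointly Gaussian.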

* submitted to IEEE Transactions on Information Theory

Label Noise Robustness for Domain-Agnostic Fair Corrections via Nearest Neighbors Label Spreading

Jun 13, 2024

Nathan Stromberg, Rohan Ayyagari, Sanmi Koyejo, Richard Nock, Lalitha Sankar

Abstract: Last-layer retraining methods have emerged as an efficient framework for correcting existing base models. Within this framework, several methods have been proposed for correcting models for subgroup fairness with and without group membership information. Importantly, prior work has demonstrated that many of these methods are susceptible to noisy labels. To this end, we propose a drop-in correction for label noise in last-layer retraining, and demonstrate that it achieves state-of-the-art worst-group accuracy across a broad range of symmetric label noise levels and a wide variety of datasets exhibiting spurious correlations. Our proposed approach uses label spreading on a latent nearest neighbors graph and has minimal computational overhead compared to existing methods.
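The core idea of label spreading on a nearest neighbors graph can be sketched as follows. This is a generic version of the classical label-spreading iteration, F <- alpha * S * F + (1 - alpha) * Y with a symmetrically normalized graph S, not the paper's exact procedure. The function name, the unweighted symmetric kNN graph, and the values of `k`, `alpha`, and `n_iter` are all assumptions, and the 1-D points stand in for latent features.

```python
import math


def knn_label_spreading(points, noisy_labels, n_classes, k=2, alpha=0.8, n_iter=50):
    """Smooth noisy labels over a symmetric k-nearest-neighbor graph."""
    n = len(points)
    # Unweighted kNN adjacency, symmetrized (edge if either point picks the other).
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        for j in order[:k]:
            adjacency[i][j] = adjacency[j][i] = 1.0
    degree = [sum(row) for row in adjacency]
    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    spread = [
        [adjacency[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]
    # One-hot encoding of the (possibly noisy) labels.
    onehot = [[1.0 if noisy_labels[i] == c else 0.0 for c in range(n_classes)]
              for i in range(n)]
    scores = [row[:] for row in onehot]
    for _ in range(n_iter):
        scores = [
            [alpha * sum(spread[i][j] * scores[j][c] for j in range(n))
             + (1.0 - alpha) * onehot[i][c]
             for c in range(n_classes)]
            for i in range(n)
        ]
    return [max(range(n_classes), key=lambda c: scores[i][c]) for i in range(n)]


# Two well-separated clusters; the middle point of the first has a flipped label.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,),
          (20.0,), (21.0,), (22.0,), (23.0,), (24.0,)]
noisy_labels = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]
cleaned_labels = knn_label_spreading(points, noisy_labels, n_classes=2)
```

In this toy case the flipped label at index 2 is outvoted by its neighbors and reverts to class 0, while the second cluster is untouched because the two clusters share no edges.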

Theoretical Guarantees of Data Augmented Last Layer Retraining Methods

May 09, 2024

Monica Welfert, Nathan Stromberg, Lalitha Sankar

Abstract: Ensuring fair predictions across many distinct subpopulations in the training data can be prohibitive for large models. Recently, simple linear last layer retraining strategies, in combination with data augmentation methods such as upweighting, downsampling, and mixup, have been shown to achieve state-of-the-art performance for worst-group accuracy, which quantifies accuracy for the least prevalent subpopulation. For linear last layer retraining and the above-mentioned augmentations, we present the optimal worst-group accuracy when modeling the distribution of the latent representations (input to the last layer) as Gaussian for each subpopulation. We evaluate and verify our results for both synthetic and large publicly available datasets.
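As a small illustration of the quantities named in this abstract, the sketch below computes worst-group accuracy and a set of upweighting coefficients. It rests on two assumptions that are not from the paper: worst-group accuracy is taken as the minimum per-group accuracy (the usual convention), and upweighting uses inverse group frequency so that every group carries equal total weight. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def worst_group_accuracy(y_true, y_pred, groups):
    """Return (minimum per-group accuracy, dict of per-group accuracies)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    per_group = {g: correct[g] / total[g] for g in total}
    return min(per_group.values()), per_group


def upweighting_weights(groups):
    """Inverse-frequency sample weights; each group gets equal total weight."""
    counts = Counter(groups)
    n, n_groups = len(groups), len(counts)
    return [n / (n_groups * counts[g]) for g in groups]


# Toy data: a majority group of 8 samples and a minority group of 2.
groups = ["majority"] * 8 + ["minority"] * 2
y_true = [1] * 8 + [0] * 2
y_pred = [1] * 8 + [0, 1]  # one of the two minority samples is misclassified

wga, per_group = worst_group_accuracy(y_true, y_pred, groups)
weights = upweighting_weights(groups)
```

Here overall accuracy is 0.9 while worst-group accuracy is 0.5, which is the gap that upweighting (weights of 0.625 and 2.5 in this example) or downsampling is meant to close.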

* Extended version of a paper accepted to ISIT 2024. arXiv admin note: text overlap with arXiv:2402.11039

Robustness to Subpopulation Shift with Domain Label Noise via Regularized Annotation of Domains

Feb 16, 2024

Nathan Stromberg, Rohan Ayyagari, Monica Welfert, Sanmi Koyejo, Lalitha Sankar

Abstract: Existing methods for last layer retraining that aim to optimize worst-group accuracy (WGA) rely heavily on well-annotated groups in the training data. We show, both in theory and practice, that annotation-based data augmentations using either downsampling or upweighting for WGA are susceptible to domain annotation noise, and in high-noise regimes approach the WGA of a model trained with vanilla empirical risk minimization. We introduce Regularized Annotation of Domains (RAD) in order to train robust last layer classifiers without the need for explicit domain annotations. Our results show that RAD is competitive with other recently proposed domain annotation-free techniques. Most importantly, RAD outperforms state-of-the-art annotation-reliant methods even with only 5% noise in the training data for several publicly available datasets.

Smoothly Giving up: Robustness for Simple Models

Feb 17, 2023

Tyler Sypherd, Nathan Stromberg, Richard Nock, Visar Berisha, Lalitha Sankar

Abstract: There is a growing need for models that are interpretable and have reduced energy and computational cost (e.g., in health care analytics and federated learning). Examples of algorithms to train such models include logistic regression and boosting. However, one challenge facing these algorithms is that they provably suffer from label noise; this has been attributed to the joint interaction between oft-used convex loss functions and simpler hypothesis classes, resulting in too much emphasis being placed on outliers. In this work, we use the margin-based $\alpha$-loss, which continuously tunes between canonical convex and quasi-convex losses, to robustly train simple models. We show that the $\alpha$ hyperparameter smoothly introduces non-convexity and offers the benefit of "giving up" on noisy training examples. We also provide results on the Long-Servedio dataset for boosting and a COVID-19 survey dataset for logistic regression, highlighting the efficacy of our approach across multiple relevant domains.
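The "giving up" behavior can be seen numerically in a short sketch. It assumes the form of the margin-based $\alpha$-loss commonly given in the $\alpha$-loss literature, $\frac{\alpha}{\alpha-1}\left(1 - \sigma(z)^{1 - 1/\alpha}\right)$ with $\sigma$ the sigmoid and $z$ the margin, which reduces to the exponential loss at $\alpha = 1/2$, the logistic loss at $\alpha = 1$, and the sigmoid loss as $\alpha \to \infty$. The abstract does not state the formula, so treat this form and the function names as assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def margin_alpha_loss(z, alpha):
    """Margin-based alpha-loss at margin z, for alpha in (0, inf]."""
    if alpha == 1:
        # Logistic loss log(1 + exp(-z)), written in an overflow-safe form.
        return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
    if math.isinf(alpha):
        # Sigmoid loss, the alpha -> infinity limit.
        return 1.0 - sigmoid(z)
    return alpha / (alpha - 1.0) * (1.0 - sigmoid(z) ** (1.0 - 1.0 / alpha))


# A badly misclassified (possibly mislabeled) example has a large negative margin.
bad_margin = -50.0
logistic_penalty = margin_alpha_loss(bad_margin, 1)   # grows roughly linearly in |z|
bounded_penalty = margin_alpha_loss(bad_margin, 2.0)  # saturates near alpha / (alpha - 1)
```

For $\alpha > 1$ the loss is bounded above by $\alpha / (\alpha - 1)$, so a single outlier contributes at most a fixed penalty, whereas the logistic loss keeps growing with the size of the negative margin.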

* To appear in AISTATS 2023
