Abstract: Fairness metrics based on the area under the receiver operating characteristic curve (AUC) have gained increasing attention in high-stakes domains such as healthcare, finance, and criminal justice. In these domains, fairness is often evaluated over risk scores rather than binary outcomes, and a common challenge is that enforcing strict fairness can significantly degrade AUC performance. To address this challenge, we propose Fair Proportional Optimal Transport (FairPOT), a novel, model-agnostic post-processing framework that aligns risk score distributions across groups using optimal transport, but does so selectively, transforming only a controllable proportion of scores within the disadvantaged group, namely the top-lambda quantile. By varying lambda, our method allows a tunable trade-off between reducing AUC disparities and maintaining overall AUC performance. Furthermore, we extend FairPOT to the partial AUC setting, enabling fairness interventions to concentrate on the highest-risk regions. Extensive experiments on synthetic, public, and clinical datasets show that FairPOT consistently outperforms existing post-processing techniques in both global and partial AUC scenarios, often achieving improved fairness with only slight AUC degradation or even gains in utility. The computational efficiency and practical adaptability of FairPOT make it a promising solution for real-world deployment.
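To make the selective transport idea in the abstract above concrete, the following is a minimal sketch, not the authors' implementation. It relies on the fact that one-dimensional optimal transport under a convex cost reduces to monotone quantile matching, and it assumes that "transforming the top-lambda quantile" means remapping only the highest-scoring fraction `lam` of the disadvantaged group onto the same quantile levels of a reference group. The function name `partial_quantile_transport` and its arguments are hypothetical.

```python
import numpy as np

def partial_quantile_transport(scores_dis, scores_ref, lam):
    """Remap the top-`lam` fraction of `scores_dis` onto matching quantiles of `scores_ref`.

    Assumption: this is one plausible reading of the abstract, not FairPOT itself.
    Scores outside the top-`lam` fraction are returned unchanged.
    """
    scores_dis = np.asarray(scores_dis, dtype=float)
    scores_ref = np.asarray(scores_ref, dtype=float)
    out = scores_dis.copy()
    n = len(scores_dis)
    k = int(round(lam * n))  # number of scores to transform
    if k == 0:
        return out

    # Empirical quantile level of each disadvantaged-group score, in (0, 1).
    order = np.argsort(scores_dis, kind="stable")
    levels = np.empty(n)
    levels[order] = (np.arange(n) + 0.5) / n

    # Indices of the k highest scores; send each to the reference-group
    # quantile at the same level (the 1D monotone transport map).
    top = order[n - k:]
    out[top] = np.quantile(scores_ref, levels[top])
    return out

# Illustrative usage on simulated scores (hypothetical data).
rng = np.random.default_rng(0)
disadvantaged = rng.beta(2.0, 5.0, size=500)
reference = rng.beta(3.0, 4.0, size=500)
adjusted = partial_quantile_transport(disadvantaged, reference, lam=0.3)
```

Setting `lam=0` leaves the scores untouched and `lam=1` performs full quantile matching, so intermediate values interpolate between no intervention and complete alignment, in the spirit of the trade-off the abstract describes.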
Abstract: Participants enrolled in randomized controlled trials (RCTs) often do not reflect real-world populations. Previous research on how best to translate RCT results to target populations has focused on weighting RCT data to resemble the target data. Simulation work, however, has suggested that an outcome model approach may be preferable. Here we describe such an approach using source data from the 2x2 factorial NAVIGATOR trial, which evaluated the impact of valsartan and nateglinide on cardiovascular outcomes and new-onset diabetes in a pre-diabetic population. Our target data consisted of people with pre-diabetes served at our institution. We used Random Survival Forests to develop separate outcome models for each of the 4 treatments and estimated the 5-year risk difference for progression to diabetes. We then estimated the treatment effect in our local patient population, as well as in sub-populations, and compared the results to the traditional weighting approach. Our models suggested that the treatment effect for valsartan in our patient population was the same as in the trial, whereas the treatment effect for nateglinide was stronger than that observed in the original trial. Our effect estimates were more efficient than those from the weighting approach.
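The outcome model approach described above can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the study's pipeline: ordinary least squares stands in for the Random Survival Forests, there are two arms instead of four, the outcome is a simulated continuous risk rather than a 5-year survival estimate, and all names (`fit_linear`, `predict_linear`, `transported_risk_difference`) and data are hypothetical. The idea shown is the same: fit one outcome model per arm on trial data, predict both potential outcomes for every person in the target population, and average the difference.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept (stand-in for a survival model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def transported_risk_difference(X_trial, treated, y_trial, X_target):
    """Fit one outcome model per arm on the trial, then average the
    predicted treated-minus-control difference over the target covariates."""
    coef1 = fit_linear(X_trial[treated == 1], y_trial[treated == 1])
    coef0 = fit_linear(X_trial[treated == 0], y_trial[treated == 0])
    diff = predict_linear(coef1, X_target) - predict_linear(coef0, X_target)
    return float(np.mean(diff))

# Simulated trial: the treatment effect depends on a covariate x,
# so it differs between populations with different x distributions.
rng = np.random.default_rng(0)
n = 4000
X_trial = rng.normal(0.0, 1.0, size=(n, 1))
treated = rng.integers(0, 2, size=n)
effect = -0.10 - 0.05 * X_trial[:, 0]
y_trial = 0.30 + 0.05 * X_trial[:, 0] + treated * effect + rng.normal(0.0, 0.05, n)

# Simulated target population with covariate mean 1 instead of 0,
# so the true target effect by construction is -0.10 - 0.05 * 1 = -0.15.
X_target = rng.normal(1.0, 1.0, size=(n, 1))

rd_trial = transported_risk_difference(X_trial, treated, y_trial, X_trial)
rd_target = transported_risk_difference(X_trial, treated, y_trial, X_target)
print(f"trial: {rd_trial:.3f}, target: {rd_target:.3f}")
```

Because the models capture how the effect varies with covariates, the same fitted models can also be averaged over any sub-population of the target data by passing only those rows as `X_target`.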