Abstract:Artificial-intelligence systems are becoming ubiquitous in society, yet their predictions typically inherit biases with respect to protected attributes such as race, gender, or age. Classical fairness notions, most notably Statistical Parity (SP), demand that predictions be independent of the protected attributes, but are overly restrictive when these attributes influence mediating variables that are considered business necessities. Recent causal formulations relax SP by distinguishing allowed from not-allowed causal paths and by complementing SP with Predictive Parity (PP), requiring the predictor to replicate the legitimate influence of business-necessities. Existing path-based definitions are mainly practical when applied to categorical attributes. This paper introduces a new framework for fairness in structural causal models that is tailored to continuous protected attributes. We formalize SP and PP through path-specific partial derivatives, establish conditions under which these criteria coincide with prior causal definitions, and characterize when a fair predictor, one that satisfies SP along not-allowed paths while achieving PP along allowed paths, exists. Building on this theory, we propose a fair tuning algorithm that either constructs such a predictor or, when not possible, allows for a trade-off between SP and PP. We present experiments on simulated and real data to evaluate our proposal, compare it with previously proposed methods, and show that it performs better when PP is considered.



Abstract:Various methods have recently been proposed to estimate causal effects with confidence intervals that are uniformly valid over a set of data generating processes when high-dimensional nuisance models are estimated by post-model-selection or machine learning estimators. These methods typically require that all the confounders are observed to ensure identification of the effects. We contribute by showing how valid semiparametric inference can be obtained in the presence of unobserved confounders and high-dimensional nuisance models. We propose uncertainty intervals which allow for unobserved confounding, and show that the resulting inference is valid when the amount of unobserved confounding is small relative to the sample size; the latter is formalized in terms of convergence rates. Simulation experiments illustrate the finite sample properties of the proposed intervals and investigate an alternative procedure that improves the empirical coverage of the intervals when the amount of unobserved confounding is large. Finally, a case study on the effect of smoking during pregnancy on birth weight is used to illustrate the use of the methods introduced to perform a sensitivity analysis to unobserved confounding.