Abstract: This paper introduces a novel kernel-based formulation of the Equalized Odds (EO) criterion, denoted $EO_k$, for fair representation learning (FRL) in supervised settings. The central goal of FRL is to mitigate discrimination with respect to a sensitive attribute $S$ while preserving prediction accuracy for the target variable $Y$. Our proposed criterion enables a rigorous and interpretable quantification of three core fairness objectives: independence (the prediction $\hat{Y}$ is independent of $S$), separation (also known as equalized odds: $\hat{Y}$ is independent of $S$ conditioned on the target attribute $Y$), and calibration ($Y$ is independent of $S$ conditioned on the prediction $\hat{Y}$). We show that under the unbiased condition ($Y$ independent of $S$), $EO_k$ satisfies both independence and separation, while under the biased condition ($Y$ dependent on $S$) it uniquely preserves predictive accuracy while lower-bounding independence and calibration, thereby offering a unified analytical characterization of the trade-offs among these fairness criteria. We further define the empirical counterpart, $\hat{EO}_k$, a kernel-based statistic that can be computed in quadratic time, with linear-time approximations also available. We derive a concentration inequality for $\hat{EO}_k$, providing performance guarantees and error bounds that serve as practical certificates of fairness compliance. While our focus is on theoretical development, the results lay essential groundwork for principled and provably fair algorithmic design in future empirical studies.
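To make the quadratic-time claim concrete, the sketch below implements a generic statistic of the equalized-odds type: a class-conditional squared MMD between the two sensitive groups, summed over target classes. The helper names (`eo_hat`, `mmd2_biased`, `gaussian_kernel`), the Gaussian kernel, and the class-wise MMD definition are illustrative assumptions for exposition, not the paper's exact definition of $\hat{EO}_k$.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Z**2, 1)[None, :] - 2 * X @ Z.T
    return np.exp(-sq / (2 * sigma**2))

def mmd2_biased(X, Z, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples X and Z."""
    return (gaussian_kernel(X, X, sigma).mean()
            + gaussian_kernel(Z, Z, sigma).mean()
            - 2 * gaussian_kernel(X, Z, sigma).mean())

def eo_hat(reps, s, y, sigma=1.0):
    """Illustrative (hypothetical) kernel EO statistic: for each target class c,
    compare the representation distributions of the two sensitive groups via
    squared MMD, then sum over classes. Cost is quadratic in the sample size,
    matching the complexity stated in the abstract."""
    total = 0.0
    for c in np.unique(y):
        X = reps[(y == c) & (s == 0)]  # representations with S = 0 in class c
        Z = reps[(y == c) & (s == 1)]  # representations with S = 1 in class c
        if len(X) and len(Z):
            total += mmd2_biased(X, Z, sigma)
    return total
```

A value of zero for every class-wise term corresponds to the separation criterion ($\hat{Y}$ independent of $S$ given $Y$) holding in the kernel's feature space under this illustrative construction.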
Abstract: Maximum Mean Discrepancy (MMD) is a probability metric that has found numerous applications in machine learning. In this work, we focus on its application in generative models, including the minimum MMD estimator, the Generative Moment Matching Network (GMMN), and the Generative Adversarial Network (GAN). In these cases, MMD is part of an objective function in a minimization or min-max optimization problem. Although the empirical performance of these models is competitive, a consistency and convergence-rate analysis of the corresponding MMD-based estimators has yet to be carried out. We propose a uniform concentration inequality for a class of MMD-based estimators, that is, a maximum deviation bound on empirical MMD values over a collection of generated distributions and adversarially learned kernels. This inequality serves as an efficient tool for the theoretical analysis of MMD-based generative models. As illustrative examples, we apply our main result to provide generalization error bounds for MMD-based estimators in the context of the minimum MMD estimator and the MMD GAN.
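For concreteness, the sketch below computes the standard unbiased (U-statistic) estimator of squared MMD with a Gaussian kernel in quadratic time, following Gretton et al. (2012); the bandwidth and helper names are illustrative choices, not the paper's notation.

```python
import numpy as np

def rbf(X, Z, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Z**2, 1)[None, :] - 2 * X @ Z.T
    return np.exp(-sq / (2 * sigma**2))

def mmd2_unbiased(X, Y, sigma=1.0):
    """Unbiased U-statistic estimate of MMD^2 between samples X (m x d)
    and Y (n x d), using O((m + n)^2) kernel evaluations."""
    m, n = len(X), len(Y)
    Kxx, Kyy, Kxy = rbf(X, X, sigma), rbf(Y, Y, sigma), rbf(X, Y, sigma)
    # Exclude diagonal terms so that each expectation is estimated without bias.
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * Kxy.mean()
```

In this framing, minimizing `mmd2_unbiased(data, generator_samples(theta))` over the generator parameters `theta` yields a minimum MMD estimator, and additionally maximizing over kernel parameters (here, `sigma`) gives the min-max objective of an MMD GAN; the uniform concentration inequality bounds the deviation of such empirical MMD values simultaneously over these generated distributions and learned kernels.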