Get our free extension to see links to code for papers anywhere online!Free extension: code links for papers anywhere!Free add-on: See code for papers anywhere!

Let $(x_{i}, y_{i})_{i=1,\dots,n}$ denote independent samples from a general mixture distribution $\sum_{c\in\mathcal{C}}\rho_{c}P_{c}^{x}$, and consider the hypothesis class of generalized linear models $\hat{y} = F(\Theta^{\top}x)$. In this work, we investigate the asymptotic joint statistics of the family of generalized linear estimators $(\Theta_{1}, \dots, \Theta_{M})$ obtained either from (a) minimizing an empirical risk $\hat{R}_{n}(\Theta;X,y)$ or (b) sampling from the associated Gibbs measure $\exp(-\beta n \hat{R}_{n}(\Theta;X,y))$. Our main contribution is to characterize under which conditions the asymptotic joint statistics of this family depends (on a weak sense) only on the means and covariances of the class conditional features distribution $P_{c}^{x}$. In particular, this allow us to prove the universality of different quantities of interest, such as the training and generalization errors, redeeming a recent line of work in high-dimensional statistics working under the Gaussian mixture hypothesis. Finally, we discuss the applications of our results to different machine learning tasks of interest, such as ensembling and uncertainty