Abstract: Identifying low-dimensional sufficient structures in nonlinear sufficient dimension reduction (SDR) has long been a fundamental yet challenging problem. Most existing methods lack theoretical guarantees of exhaustiveness in identifying lower-dimensional structures, either at the population level or at the sample level. We tackle this issue by proposing a new method, generative sufficient dimension reduction (GenSDR), which leverages modern generative models. We show that GenSDR fully recovers the information contained in the central $\sigma$-field at both the population and sample levels. In particular, at the sample level, we establish a consistency property for the GenSDR estimator from the perspective of conditional distributions, capitalizing on the distributional learning capabilities of deep generative models. Moreover, by incorporating an ensemble technique, we extend GenSDR to accommodate non-Euclidean responses, thereby substantially broadening its applicability. Extensive numerical results demonstrate the strong empirical performance of GenSDR and highlight its potential for addressing a wide range of complex, real-world tasks.
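
To make the setup concrete, below is a minimal sketch, assuming a GenSDR-style training loop couples a nonlinear representation network R (whose output plays the role of a statistic generating the central $\sigma$-field) with a conditional generator G(R(x), eps) fitted by a distributional objective. The energy-score loss, network sizes, and all names are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
p, d, noise_dim = 10, 1, 4   # ambient dimension, assumed structural dimension, noise dimension

R = nn.Sequential(           # nonlinear sufficient representation R: R^p -> R^d
    nn.Linear(p, 32), nn.ReLU(), nn.Linear(32, d),
)
G = nn.Sequential(           # conditional generator G(R(x), eps)
    nn.Linear(d + noise_dim, 64), nn.ReLU(), nn.Linear(64, 1),
)

def sample(x):
    """Draw one sample from the learned conditional law of Y given X = x."""
    eps = torch.randn(x.shape[0], noise_dim)
    return G(torch.cat([R(x), eps], dim=1)).squeeze(-1)

# Toy data: Y depends on X only through X[:, 0], with heteroscedastic noise,
# so the conditional distribution is determined by a one-dimensional statistic.
X = torch.randn(4096, p)
y = X[:, 0] ** 2 + (0.1 + X[:, 0].abs()) * torch.randn(4096)

opt = torch.optim.Adam(list(R.parameters()) + list(G.parameters()), lr=1e-3)
for _ in range(1000):
    y1, y2 = sample(X), sample(X)   # two independent generator draws per x
    # Energy-score objective: targets the full conditional distribution,
    # not just the conditional mean.
    loss = 0.5 * ((y1 - y).abs().mean() + (y2 - y).abs().mean()) \
           - 0.5 * (y1 - y2).abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, R(x) serves as a low-dimensional sufficient representation,
# and sample(x_new) draws from the estimated conditional distribution of Y | X.
```

The energy score used here is one standard proper scoring rule for distributional learning; it stands in for whatever objective the paper actually employs, and is chosen only because it is simple to estimate from two generator draws per observation.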




Abstract: This paper investigates the connection between neural networks and sufficient dimension reduction (SDR), demonstrating that, under an appropriate rank regularization, neural networks inherently perform SDR in regression tasks. Specifically, the weights of the first layer span the central mean subspace. We establish the statistical consistency of the neural network-based estimator of the central mean subspace, underscoring the suitability of neural networks for addressing SDR-related challenges. Numerical experiments further validate our theoretical findings and highlight the capability of neural networks to facilitate SDR compared with existing methods. Additionally, we discuss an extension that recovers the central subspace, broadening the scope of our investigation.
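
As one concrete reading of the stated mechanism, the following is a minimal sketch assuming the rank regularization is imposed architecturally, via a bias-free linear first layer of width d, so that the row space of the trained first-layer weight estimates the central mean subspace. The class name, layer sizes, toy data, and training loop are hypothetical, not the authors' implementation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
p, d = 10, 2   # ambient dimension and (assumed known) structural dimension

class RankConstrainedNet(nn.Module):
    """Regression network whose first layer is a rank-d linear map R^p -> R^d."""
    def __init__(self, p, d, hidden=64):
        super().__init__()
        # The width-d bottleneck enforces the rank constraint directly.
        self.first = nn.Linear(p, d, bias=False)
        self.head = nn.Sequential(
            nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.head(self.first(x)).squeeze(-1)

# Toy data: E[Y | X] depends on X only through two linear indices,
# span{e1 + e2, e3}, so the central mean subspace has dimension 2.
X = torch.randn(2000, p)
y = (X[:, 0] + X[:, 1]).sin() + 0.5 * X[:, 2] ** 2 + 0.1 * torch.randn(2000)

net = RankConstrainedNet(p, d)
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    nn.functional.mse_loss(net(X), y).backward()
    opt.step()

# The row space of the trained first-layer weight (a d x p matrix) estimates
# the central mean subspace; orthonormalize it for comparison with the truth.
Q, _ = torch.linalg.qr(net.first.weight.detach().T)   # p x d orthonormal basis
print(Q)
```

An architectural bottleneck is only one way to cap the rank; a penalty-based regularizer on the first-layer weight would serve the same purpose, and the abstract does not specify which form the paper uses.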