Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Adrien Schertzer

Joint Learning in the Gaussian Single Index Model

May 27, 2025

Loucas Pillaud-Vivien, Adrien Schertzer

Abstract:We consider the problem of jointly learning a one-dimensional projection and a univariate function in high-dimensional Gaussian models. Specifically, we study predictors of the form $f(x)=\varphi^\star(\langle w^\star, x \rangle)$, where both the direction $w^\star \in \mathcal{S}_{d-1}$, the sphere of $\mathbb{R}^d$, and the function $\varphi^\star: \mathbb{R} \to \mathbb{R}$ are learned from Gaussian data. This setting captures a fundamental non-convex problem at the intersection of representation learning and nonlinear regression. We analyze the gradient flow dynamics of a natural alternating scheme and prove convergence, with a rate controlled by the information exponent reflecting the \textit{Gaussian regularity} of the function $\varphi^\star$. Strikingly, our analysis shows that convergence still occurs even when the initial direction is negatively correlated with the target. On the practical side, we demonstrate that such joint learning can be effectively implemented using a Reproducing Kernel Hilbert Space (RKHS) adapted to the structure of the problem, enabling efficient and flexible estimation of the univariate function. Our results offer both theoretical insight and practical methodology for learning low-dimensional structure in high-dimensional settings.

* 31 Pages, 3 Figures

Via

Access Paper or Ask Questions

Stochastic Differential Equations models for Least-Squares Stochastic Gradient Descent

Jul 02, 2024

Adrien Schertzer, Loucas Pillaud-Vivien

Figure 1 for Stochastic Differential Equations models for Least-Squares Stochastic Gradient Descent

Figure 2 for Stochastic Differential Equations models for Least-Squares Stochastic Gradient Descent

Figure 3 for Stochastic Differential Equations models for Least-Squares Stochastic Gradient Descent

Abstract:We study the dynamics of a continuous-time model of the Stochastic Gradient Descent (SGD) for the least-square problem. Indeed, pursuing the work of Li et al. (2019), we analyze Stochastic Differential Equations (SDEs) that model SGD either in the case of the training loss (finite samples) or the population one (online setting). A key qualitative feature of the dynamics is the existence of a perfect interpolator of the data, irrespective of the sample size. In both scenarios, we provide precise, non-asymptotic rates of convergence to the (possibly degenerate) stationary distribution. Additionally, we describe this asymptotic distribution, offering estimates of its mean, deviations from it, and a proof of the emergence of heavy-tails related to the step-size magnitude. Numerical simulations supporting our findings are also presented.

Via

Access Paper or Ask Questions