Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zhonghua Liu

Deep Neural Network Training as Random Effects: An Optimization-Inference Duality

May 27, 2026

Minhao Yao, Ruoyu Wang, Xihong Lin, Lin Liu, Zhonghua Liu

Abstract:Deep neural networks (DNNs) have achieved remarkable empirical success, yet their training dynamics remain understood mainly from optimization rather than statistical principles. Here we develop a statistical framework for DNN training in the over-parameterized regime by showing that the prediction induced by continuous-time neural tangent kernel (NTK) gradient flow is exactly equivalent to that from a classical random-effects model. In this framework, training time acts as a variance component, or equivalently an empirical Bayes covariance hyperparameter, governing the allocation of variation from noise to structured signal. This equivalence reveals an optimization-inference duality: the gradient-flow path is both an optimization trajectory and an empirical Bayes random-effects inference path. Conditional on training time, the network output is the posterior mean of the latent signal, and estimating training time by restricted maximum likelihood (REML) turns early stopping into likelihood-based empirical Bayes inference rather than external tuning. This perspective yields a two-stage inferential procedure. First, a variance-component test determines whether DNN training captures statistically significant structure beyond initialization. Second, conditional on training being warranted, REML provides a likelihood-based early stopping rule. The resulting stopping time admits a spectral interpretation in the NTK eigenbasis, where training proceeds until spectral loss decorrelation is achieved. We further establish that REML-guided early stopping achieves asymptotically optimal prediction error for fixed-design in-sample prediction and, under additional random-design regularity conditions, for out-of-sample prediction. This work reframes DNN training as statistical inference and provides a principled foundation for deciding whether and how long to train deep neural networks.

Via

Access Paper or Ask Questions

DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation

Aug 04, 2024

Qinshuo Liu, Zixin Wang, Xi-An Li, Xinyao Ji, Lei Zhang, Lin Liu, Zhonghua Liu

Figure 1 for DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation

Figure 2 for DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation

Figure 3 for DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation

Figure 4 for DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation

Abstract:Semiparametric statistics play a pivotal role in a wide range of domains, including but not limited to missing data, causal inference, and transfer learning, to name a few. In many settings, semiparametric theory leads to (nearly) statistically optimal procedures that yet involve numerically solving Fredholm integral equations of the second kind. Traditional numerical methods, such as polynomial or spline approximations, are difficult to scale to multi-dimensional problems. Alternatively, statisticians may choose to approximate the original integral equations by ones with closed-form solutions, resulting in computationally more efficient, but statistically suboptimal or even incorrect procedures. To bridge this gap, we propose a novel framework by formulating the semiparametric estimation problem as a bi-level optimization problem; and then we develop a scalable algorithm called Deep Neural-Nets Assisted Semiparametric Estimation (DNA-SE) by leveraging the universal approximation property of Deep Neural-Nets (DNN) to streamline semiparametric procedures. Through extensive numerical experiments and a real data analysis, we demonstrate the numerical and statistical advantages of $\dnase$ over traditional methods. To the best of our knowledge, we are the first to bring DNN into semiparametric statistics as a numerical solver of integral equations in our proposed general framework.

* semiparametric statistics, missing data, causal inference, Fredholm integral equations of the second kind, bi-level optimization, deep learning, AI for science

Via

Access Paper or Ask Questions

Automatic Speech Disentanglement for Voice Conversion using Rank Module and Speech Augmentation

Jun 21, 2023

Zhonghua Liu, Shijun Wang, Ning Chen

Abstract:Voice Conversion (VC) converts the voice of a source speech to that of a target while maintaining the source's content. Speech can be mainly decomposed into four components: content, timbre, rhythm and pitch. Unfortunately, most related works only take into account content and timbre, which results in less natural speech. Some recent works are able to disentangle speech into several components, but they require laborious bottleneck tuning or various hand-crafted features, each assumed to contain disentangled speech information. In this paper, we propose a VC model that can automatically disentangle speech into four components using only two augmentation functions, without the requirement of multiple hand-crafted features or laborious bottleneck tuning. The proposed model is straightforward yet efficient, and the empirical results demonstrate that our model can achieve a better performance than the baseline, regarding disentanglement effectiveness and speech naturalness.

* Accepted by INTERSPEECH2023

Via

Access Paper or Ask Questions

DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning

Oct 10, 2022

Siqi Xu, Lin Liu, Zhonghua Liu

Figure 1 for DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning

Figure 2 for DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning

Figure 3 for DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning

Figure 4 for DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning

Abstract:Causal mediation analysis can unpack the black box of causality and is therefore a powerful tool for disentangling causal pathways in biomedical and social sciences, and also for evaluating machine learning fairness. To reduce bias for estimating Natural Direct and Indirect Effects in mediation analysis, we propose a new method called DeepMed that uses deep neural networks (DNNs) to cross-fit the infinite-dimensional nuisance functions in the efficient influence functions. We obtain novel theoretical results that our DeepMed method (1) can achieve semiparametric efficiency bound without imposing sparsity constraints on the DNN architecture and (2) can adapt to certain low dimensional structures of the nuisance functions, significantly advancing the existing literature on DNN-based semiparametric causal inference. Extensive synthetic experiments are conducted to support our findings and also expose the gap between theory and practice. As a proof of concept, we apply DeepMed to analyze two real datasets on machine learning fairness and reach conclusions consistent with previous findings.

* Accepted by NeurIPS 2022

Via

Access Paper or Ask Questions