Xiaohan Zhu

Overfitting and Generalizing with (PAC) Bayesian Prediction in Noisy Binary Classification

Mar 23, 2026

Xiaohan Zhu, Mesrob I. Ohannessian, Nathan Srebro

Abstract: We consider a PAC-Bayes type learning rule for binary classification, balancing the training error of a randomized "posterior" predictor with its KL divergence to a pre-specified "prior". This can be seen as an extension of a modified two-part-code Minimum Description Length (MDL) learning rule to continuous priors and randomized predictions. With a balancing parameter of $\lambda=1$, this learning rule recovers an (empirical) Bayes posterior, and a modified variant recovers the profile posterior, linking with standard Bayesian prediction (up to the treatment of the single-parameter noise level). However, from a risk-minimization prediction perspective, this Bayesian predictor overfits and can lead to non-vanishing excess loss in the agnostic case. Instead, a choice of $\lambda\gg 1$, which can be seen as using a sample-size-dependent prior, ensures uniformly vanishing excess loss even in the agnostic case. We precisely characterize the effect of under-regularizing (and over-regularizing) as a function of the balance parameter $\lambda$, understanding the regimes in which this under-regularization is tempered or catastrophic. This work extends previous work by Zhu and Srebro [2025], which considered only discrete priors, to PAC-Bayes type learning rules and, through their rigorous Bayesian interpretation, to Bayesian prediction more generally.
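
The trade-off described in the abstract can be illustrated with a minimal sketch on a finite hypothesis class. Everything in it is an assumption made for illustration and not the paper's exact rule: the objective is taken to be `beta * E_Q[mistakes] + lam * KL(Q || P)` with `beta = log((1 - eps) / eps)` for a fixed, known label-noise level `eps`, and the names `gibbs_posterior` and `objective` are hypothetical. Under that assumed objective the minimizer is the Gibbs distribution `Q(h) ∝ P(h) * exp(-beta * mistakes(h) / lam)`, which at `lam = 1` coincides with the Bayes posterior for label noise `eps`.

```python
import math

def gibbs_posterior(prior, mistakes, lam, eps=0.1):
    """Minimizer of the ASSUMED objective beta*E_Q[mistakes] + lam*KL(Q||P).

    prior:    strictly positive prior probabilities P(h), summing to 1
    mistakes: number of training mistakes of each hypothesis h
    lam:      balance parameter (larger values keep Q closer to the prior)
    eps:      assumed known label-noise level in (0, 1/2)
    """
    beta = math.log((1 - eps) / eps)
    # Work in log space and subtract the max for numerical stability.
    log_w = [math.log(p) - beta * k / lam for p, k in zip(prior, mistakes)]
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    z = sum(w)
    return [v / z for v in w]

def objective(post, prior, mistakes, lam, eps=0.1):
    """Value of the assumed objective for a candidate posterior `post`."""
    beta = math.log((1 - eps) / eps)
    fit = beta * sum(q * k for q, k in zip(post, mistakes))
    kl = sum(q * math.log(q / p) for q, p in zip(post, prior) if q > 0)
    return fit + lam * kl

if __name__ == "__main__":
    prior = [0.5, 0.3, 0.2]
    mistakes = [3, 1, 0]  # training mistakes of three hypotheses
    for lam in (1.0, 5.0, 50.0):
        print(lam, gibbs_posterior(prior, mistakes, lam))
```

In this toy setting, increasing `lam` pulls the posterior back toward the prior, which is the direction of stronger regularization that the abstract associates with $\lambda\gg 1$. The sketch does not reproduce the paper's treatment of the noise level or its continuous priors.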

Quantifying Overfitting along the Regularization Path for Two-Part-Code MDL in Supervised Classification

Mar 03, 2025

Xiaohan Zhu, Nathan Srebro

Abstract: We provide a complete characterization of the entire regularization curve of a modified two-part-code Minimum Description Length (MDL) learning rule for binary classification, based on an arbitrary prior or description language. \citet{GL} previously established the lack of asymptotic consistency, from an agnostic PAC (frequentist worst case) perspective, of the MDL rule with a penalty parameter of $\lambda=1$, suggesting that it under-regularizes. Driven by interest in understanding how benign or catastrophic under-regularization and overfitting might be, we obtain a precise quantitative description of the worst case limiting error as a function of the regularization parameter $\lambda$ and noise level (or approximation error), significantly tightening the analysis of \citeauthor{GL} for $\lambda=1$ and extending it to all other choices of $\lambda$.

Tight Bounds on the Binomial CDF, and the Minimum of i.i.d Binomials, in terms of KL-Divergence

Feb 25, 2025

Xiaohan Zhu, Mesrob I. Ohannessian, Nathan Srebro

Abstract: We provide finite-sample upper and lower bounds on the Binomial tail probability which are a direct application of Sanov's theorem. We then use these to obtain high-probability upper and lower bounds on the minimum of i.i.d. Binomial random variables. Both bounds are finite-sample, asymptotically tight, and expressed in terms of the KL-divergence.
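
As a rough illustration of what KL-based bounds on the Binomial CDF look like, the sketch below compares the exact lower tail with two classical bounds: the Chernoff upper bound `exp(-n * KL(k/n || p))` and the method-of-types lower bound `exp(-n * KL(k/n || p)) / (n + 1)`, both valid for `k/n <= p`. These are textbook bounds, assumed here for illustration only; they are looser than, and not the same as, the bounds the paper derives. The function names are hypothetical.

```python
import math

def kl_bernoulli(q: float, p: float) -> float:
    """KL(Bernoulli(q) || Bernoulli(p)) in nats, using 0 * log(0) = 0."""
    total = 0.0
    if q > 0:
        total += q * math.log(q / p)
    if q < 1:
        total += (1 - q) * math.log((1 - q) / (1 - p))
    return total

def binom_cdf(k: int, n: int, p: float) -> float:
    """Exact P[Bin(n, p) <= k] by direct summation of the pmf."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def classical_kl_bounds(k: int, n: int, p: float):
    """Return (lower, upper) bounds on P[Bin(n, p) <= k] for k/n <= p.

    upper: Chernoff bound exp(-n * KL(k/n || p))
    lower: method-of-types bound on P[Bin(n, p) = k], a single term of the CDF
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0 <= k <= n * p:
        raise ValueError("these bounds apply to the lower tail, 0 <= k <= n*p")
    rate = math.exp(-n * kl_bernoulli(k / n, p))
    return rate / (n + 1), rate

if __name__ == "__main__":
    n, p = 100, 0.5
    for k in (10, 30, 40):
        lower, upper = classical_kl_bounds(k, n, p)
        print(k, lower, binom_cdf(k, n, p), upper)
```

The gap between the two classical bounds is a factor of `n + 1`, which is the kind of polynomial slack that tighter finite-sample bounds aim to reduce.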

A three-dimensional force estimation method for the cable-driven soft robot based on monocular images

Sep 12, 2024

Xiaohan Zhu, Ran Bu, Zhen Li, Fan Xu, Hesheng Wang

Abstract: Soft manipulators are well suited to interaction tasks with high safety demands, e.g., robot-assisted surgery and elderly care. Yet the challenges of real-time contact feedback have hindered further applications in precise manipulation. This paper proposes an end-to-end network to estimate the 3D contact force of the soft robot, with the aim of enhancing its capabilities in interactive tasks. The method directly uses monocular images fused with multidimensional actuation information as the network inputs. This approach simplifies the preprocessing of raw data compared to related studies that use 3D shape information as network inputs, and consequently reduces configuration reconstruction errors. A unified feature representation module is devised to elevate low-dimensional features from the system's actuation signals to the same level as image features, facilitating smoother integration of multimodal information. The proposed method has been experimentally validated on the soft robot testbed, achieving satisfactory accuracy in 3D force estimation (a mean relative error of 0.84%, compared to the best-reported result of 2.2% in related work).
