Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Markus Engelund Mathiasen

The Sample Complexity of Replicable Realizable PAC Learning

Feb 23, 2026

Kasper Green Larsen, Markus Engelund Mathiasen, Chirag Pabbaraju, Clement Svendsen

Abstract:In this paper, we consider the problem of replicable realizable PAC learning. We construct a particularly hard learning problem and show a sample complexity lower bound with a close to $(\log|H|)^{3/2}$ dependence on the size of the hypothesis class $H$. Our proof uses several novel techniques and works by defining a particular Cayley graph associated with $H$ and analyzing a suitable random walk on this graph by examining the spectral properties of its adjacency matrix. Furthermore, we show an almost matching upper bound for the lower bound instance, meaning if a stronger lower bound exists, one would have to consider a different instance of the problem.

Via

Access Paper or Ask Questions

Improved Replicable Boosting with Majority-of-Majorities

Jan 30, 2025

Kasper Green Larsen, Markus Engelund Mathiasen, Clement Svendsen

Abstract:We introduce a new replicable boosting algorithm which significantly improves the sample complexity compared to previous algorithms. The algorithm works by doing two layers of majority voting, using an improved version of the replicable boosting algorithm introduced by Impagliazzo et al. [2022] in the bottom layer.

* 13 pages

Via

Access Paper or Ask Questions

The Many Faces of Optimal Weak-to-Strong Learning

Aug 30, 2024

Mikael Møller Høgsgaard, Kasper Green Larsen, Markus Engelund Mathiasen

Abstract:Boosting is an extremely successful idea, allowing one to combine multiple low accuracy classifiers into a much more accurate voting classifier. In this work, we present a new and surprisingly simple Boosting algorithm that obtains a provably optimal sample complexity. Sample optimal Boosting algorithms have only recently been developed, and our new algorithm has the fastest runtime among all such algorithms and is the simplest to describe: Partition your training data into 5 disjoint pieces of equal size, run AdaBoost on each, and combine the resulting classifiers via a majority vote. In addition to this theoretical contribution, we also perform the first empirical comparison of the proposed sample optimal Boosting algorithms. Our pilot empirical study suggests that our new algorithm might outperform previous algorithms on large data sets.

Via

Access Paper or Ask Questions