Mikhail Belkin

Average gradient outer product as a mechanism for deep neural collapse

Feb 21, 2024
Daniel Beaglehole, Peter Súkeník, Marco Mondelli, Mikhail Belkin

Via arXiv

Unmemorization in Large Language Models via Self-Distillation and Deliberate Imagination

Feb 15, 2024
Yijiang River Dong, Hongzhou Lin, Mikhail Belkin, Ramon Huerta, Ivan Vulić

Via arXiv

Linear Recursive Feature Machines provably recover low-rank matrices

Jan 09, 2024
Adityanarayanan Radhakrishnan, Mikhail Belkin, Dmitriy Drusvyatskiy

Via arXiv

On the Nystrom Approximation for Preconditioning in Kernel Machines

Dec 06, 2023
Amirhesam Abedsoltan, Mikhail Belkin, Parthe Pandit, Luis Rademacher

Via arXiv

More is Better in Modern Machine Learning: when Infinite Overparameterization is Optimal and Overfitting is Obligatory

Nov 27, 2023
James B. Simon, Dhruva Karkada, Nikhil Ghosh, Mikhail Belkin

Via arXiv

Mechanism of feature learning in convolutional neural networks

Sep 01, 2023
Daniel Beaglehole, Adityanarayanan Radhakrishnan, Parthe Pandit, Mikhail Belkin

Via arXiv

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning

Jun 07, 2023
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

Via arXiv

Aiming towards the minimizers: fast convergence of SGD for overparametrized problems

Jun 05, 2023
Chaoyue Liu, Dmitriy Drusvyatskiy, Mikhail Belkin, Damek Davis, Yi-An Ma

Via arXiv

On Emergence of Clean-Priority Learning in Early Stopped Neural Networks

Jun 05, 2023
Chaoyue Liu, Amirhesam Abedsoltan, Mikhail Belkin

Via arXiv

Cut your Losses with Squentropy

Feb 08, 2023
Like Hui, Mikhail Belkin, Stephen Wright

Via arXiv