Libin Zhu

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning

Jun 07, 2023
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

Restricted Strong Convexity of Deep Learning Models with Smooth Activations

Sep 29, 2022
Arindam Banerjee, Pedro Cisneros-Velarde, Libin Zhu, Mikhail Belkin

A note on Linear Bottleneck networks and their Transition to Multilinearity

Jun 30, 2022
Libin Zhu, Parthe Pandit, Mikhail Belkin

Quadratic models for understanding neural network dynamics

May 24, 2022
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture

May 24, 2022
Libin Zhu, Chaoyue Liu, Mikhail Belkin

Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models

Mar 10, 2022
Chaoyue Liu, Libin Zhu, Mikhail Belkin

On the linearity of large non-linear models: when and why the tangent kernel is constant

Oct 02, 2020
Chaoyue Liu, Libin Zhu, Mikhail Belkin

Toward a theory of optimization for over-parameterized systems of non-linear equations: the lessons of deep learning

Feb 29, 2020
Chaoyue Liu, Libin Zhu, Mikhail Belkin
