Boris Hanin

Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit

Sep 28, 2023

Les Houches Lectures on Deep Learning at Large & Infinite Width

Sep 08, 2023

Quantitative CLTs in Deep Neural Networks

Jul 21, 2023

Principles for Initialization and Architecture Selection in Graph Neural Networks with ReLU Activations

Jun 20, 2023

Depth Dependence of $μ$P Learning Rates in ReLU MLPs

May 13, 2023

Bayesian Interpolation with Deep Linear Networks

Jan 02, 2023

Maximal Initial Learning Rates in Deep ReLU Networks

Dec 14, 2022

Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis

May 11, 2022

Correlation Functions in Random Fully Connected Neural Networks at Finite Width

Apr 03, 2022

Ridgeless Interpolation with Shallow ReLU Networks in $1D$ is Nearest Neighbor Curvature Extrapolation and Provably Generalizes on Lipschitz Functions

Sep 27, 2021