Alexandru Meterez

Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning

Feb 27, 2024
Via arXiv

Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion

Oct 03, 2023
Via arXiv