Ekaterina Lobacheva

HSE University, Russia

Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning

Jun 05, 2025, via arXiv

SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training

May 29, 2025, via arXiv

Where Do Large Learning Rates Lead Us?

Oct 29, 2024, via arXiv

Large Learning Rates Improve Generalization: But How Large Are We Talking About?

Nov 19, 2023, via arXiv

To Stay or Not to Stay in the Pre-train Basin: Insights on Ensembling in Transfer Learning

Mar 06, 2023, via arXiv

Training Scale-Invariant Neural Networks on the Sphere Can Happen in Three Regimes

Sep 08, 2022, via arXiv

Machine Learning Methods for Spectral Efficiency Prediction in Massive MIMO Systems

Dec 29, 2021, via arXiv

On the Memorization Properties of Contrastive Learning

Jul 21, 2021, via arXiv

On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay

Jun 29, 2021, via arXiv

On Power Laws in Deep Ensembles

Jul 16, 2020, via arXiv