Andrey Gromov

Grokking Modular Polynomials

Jun 05, 2024

Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks

Jun 04, 2024

Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data

Apr 01, 2024

The Unreasonable Ineffectiveness of the Deeper Layers

Mar 26, 2024

Bridging Associative Memory and Probabilistic Modeling

Feb 15, 2024

To grok or not to grok: Disentangling generalization and memorization on corrupted algorithmic datasets

Oct 19, 2023

Grokking modular arithmetic

Jan 06, 2023

AutoInit: Automatic Initialization via Jacobian Tuning

Jun 27, 2022

Critical initialization of wide and deep neural networks through partial Jacobians: general theory and applications to LayerNorm

Nov 30, 2021