Noam Levi

Internal Data Repetition Destroys Language Models

Jun 23, 2026
Via arXiv

Statistical Properties of Training & Generalization

Jun 18, 2026
Via arXiv

Generative models on phase space

Apr 02, 2026
Via arXiv

The Implicit Bias of Logit Regularization

Feb 13, 2026
Via arXiv

More Bang for the Buck: Improving the Inference of Large Language Models at a Fixed Budget using Reset and Discard (ReD)

Jan 29, 2026
Via arXiv

Learning Shrinks the Hard Tail: Training-Dependent Inference Scaling in a Solvable Linear Model

Jan 07, 2026
Via arXiv

A Simple Model of Inference Scaling Laws

Oct 21, 2024
Via arXiv

Grokking at the Edge of Linear Separability

Oct 06, 2024
Via arXiv

Classifying Overlapping Gaussian Mixtures in High Dimensions: From Optimal Classifiers to Neural Nets

May 28, 2024
Via arXiv

Decoupled Weight Decay for Any $p$ Norm

Apr 16, 2024
Via arXiv