
Eran Malach

Annotations Mitigate Post-Training Mode Collapse

May 11, 2026
Via arXiv

LLM-ERM: Sample-Efficient Program Learning via LLM-Guided Search

Oct 16, 2025
Via arXiv

A Taxonomy of Transcendence

Aug 25, 2025
Via arXiv

Decomposing Elements of Problem Solving: What "Math" Does RL Teach?

May 28, 2025
Via arXiv

Let Me Think! A Long Chain-of-Thought Can Be Worth Exponentially Many Short Ones

May 27, 2025
Via arXiv

The Power of Random Features and the Limits of Distribution-Free Gradient Descent

May 15, 2025
Via arXiv

Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining

Apr 10, 2025
Via arXiv

To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning

Apr 09, 2025
Via arXiv

The Role of Sparsity for Length Generalization in Transformers

Feb 24, 2025
Via arXiv

Loss-to-Loss Prediction: Scaling Laws for All Datasets

Nov 19, 2024
Via arXiv