Jason D. Lee

Coverage Improvement and Fast Convergence of On-policy Preference Learning

Jan 13, 2026
Via arXiv

Neural Networks Learn Generic Multi-Index Models Near Information-Theoretic Limit

Nov 19, 2025
Via arXiv

Quantitative Bounds for Length Generalization in Transformers

Oct 30, 2025
Via arXiv

What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains

Aug 10, 2025
Via arXiv

The Generative Leap: Sharp Sample Complexity for Efficiently Learning Gaussian Multi-Index Models

Jun 05, 2025
Via arXiv

Learning Compositional Functions with Transformers from Easy-to-Hard Data

May 29, 2025
Via arXiv

Accelerating RL for LLM Reasoning with Optimal Advantage Regression

May 27, 2025
Via arXiv

Emergence and scaling laws in SGD learning of shallow neural networks

Apr 28, 2025
Via arXiv

What Makes a Reward Model a Good Teacher? An Optimization Perspective

Mar 19, 2025
Via arXiv

Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought

Feb 28, 2025
Via arXiv