Jason D. Lee

What Makes a Reward Model a Good Teacher? An Optimization Perspective

Mar 19, 2025

Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought

Feb 28, 2025

Discrepancies are Virtue: Weak-to-Strong Generalization through Lens of Intrinsic Dimension

Feb 07, 2025

Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding

Jan 01, 2025

Understanding Factual Recall in Transformers via Associative Memories

Dec 09, 2024

Anytime Acceleration of Gradient Descent

Nov 26, 2024

Learning Hierarchical Polynomials of Multiple Nonlinear Features with Three-Layer Networks

Nov 26, 2024

Understanding Optimization in Deep Learning with Central Flows

Oct 31, 2024

Learning and Transferring Sparse Contextual Bigrams with Linear Transformers

Oct 30, 2024

Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis

Oct 13, 2024