Denny Wu

Learning Compositional Functions with Transformers from Easy-to-Hard Data

May 29, 2025

Emergence and scaling laws in SGD learning of shallow neural networks

Apr 28, 2025

Propagation of Chaos in One-hidden-layer Neural Networks beyond Logarithmic Time

Apr 17, 2025

When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective

Mar 14, 2025

Metastable Dynamics of Chain-of-Thought Reasoning: Provable Benefits of Search, RL and Distillation

Feb 02, 2025

Pretrained transformer efficiently learns low-dimensional target functions in-context

Nov 04, 2024

Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics

Aug 14, 2024

Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations

Jun 17, 2024

Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit

Jun 03, 2024

Nonlinear spiked covariance matrices and signal propagation in deep neural networks

Feb 15, 2024