Andrew Gordon Wilson

Efficiently Generating Correlated Sample Paths from Multi-step Time Series Foundation Models

Oct 02, 2025
Via arXiv

Customizing the Inductive Biases of Softmax Attention using Structured Matrices

Sep 09, 2025
Via arXiv

Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful

Jul 09, 2025
Via arXiv

Out-of-Distribution Detection Methods Answer the Wrong Questions

Jul 02, 2025
Via arXiv

Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks

Jul 02, 2025
Via arXiv

Training Flexible Models of Genetic Variant Effects from Functional Annotations using Accelerated Linear Algebra

Jun 24, 2025
Via arXiv

Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion

Jun 10, 2025
Via arXiv

Compute-Optimal LLMs Provably Generalize Better With Scale

Apr 21, 2025
Via arXiv

When Should We Orchestrate Multiple Agents?

Mar 17, 2025
Via arXiv

Deep Learning is Not So Mysterious or Different

Mar 03, 2025
Via arXiv