Elliot Paquette

Phases of Muon: When Muon Eclipses SignSGD

May 10, 2026

Spectral Lens: Activation and Gradient Spectra as Diagnostics of LLM Optimization

May 07, 2026

Power-Law Spectrum of the Random Feature Model

Mar 15, 2026

Logarithmic-time Schedules for Scaling Language Models with Momentum

Feb 05, 2026

Eigenvalue distribution of the Neural Tangent Kernel in the quadratic scaling

Aug 27, 2025

Dimension-adapted Momentum Outscales SGD

May 22, 2025

Exact Risk Curves of signSGD in High-Dimensions: Quantifying Preconditioning and Noise-Compression Effects

Nov 19, 2024

A Clipped Trip: the Dynamics of SGD with Gradient Clipping in High-Dimensions

Jun 17, 2024

The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms

May 30, 2024

4+3 Phases of Compute-Optimal Neural Scaling Laws

May 23, 2024