Vincent Roulet

Autoregressive Language Models are Secretly Energy-Based Models: Insights into the Lookahead Capabilities of Next-Token Prediction

Dec 17, 2025
Via arXiv

How far away are truly hyperparameter-free learning algorithms?

May 29, 2025
Via arXiv

Joint Learning of Energy-based Models and their Partition Function

Jan 30, 2025
Via arXiv

Loss Functions and Operators Generated by f-Divergences

Jan 30, 2025
Via arXiv

Stepping on the Edge: Curvature Aware Learning Rate Tuners

Jul 08, 2024
Via arXiv

The Elements of Differentiable Programming

Mar 21, 2024
Via arXiv

On the Interplay Between Stepsize Tuning and Progressive Sharpening

Dec 07, 2023
Via arXiv

Distributionally Robust Optimization with Bias and Variance Reduction

Oct 21, 2023
Via arXiv

Dual Gauss-Newton Directions for Deep Learning

Aug 17, 2023
Via arXiv

Modified Gauss-Newton Algorithms under Noise

May 18, 2023
Via arXiv