Wu Lin

Understanding and Improving the Shampoo Optimizer via Kullback-Leibler Minimization

Sep 03, 2025
Via arXiv

Spectral-factorized Positive-definite Curvature Learning for NN Training

Feb 10, 2025
Via arXiv

Training Data Attribution via Approximate Unrolled Differentiation

May 21, 2024
Via arXiv

Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective

Feb 13, 2024
Via arXiv

Structured Inverse-Free Natural Gradient: Memory-Efficient & Numerically-Stable KFAC for Large Neural Nets

Dec 16, 2023
Via arXiv

Simplifying Momentum-based Riemannian Submanifold Optimization

Feb 20, 2023
Via arXiv

Structured second-order methods via natural gradient descent

Jul 22, 2021
Via arXiv

Tractable structured natural gradient descent using local parameterizations

Mar 04, 2021
Via arXiv

Handling the Positive-Definite Constraint in the Bayesian Learning Rule

Mar 08, 2020
Via arXiv

Stein's Lemma for the Reparameterization Trick with Exponential Family Mixtures

Oct 29, 2019
Via arXiv