
Felix Dangel

Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning

Jun 05, 2024

Lowering PyTorch's Memory Consumption for Selective Differentiation

Apr 15, 2024

Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective

Feb 13, 2024

Structured Inverse-Free Natural Gradient: Memory-Efficient & Numerically-Stable KFAC for Large Neural Nets

Dec 16, 2023

On the Disconnect Between Theory and Practice of Overparametrized Neural Networks

Sep 29, 2023

Convolutions Through the Lens of Tensor Networks

Jul 05, 2023

The Geometry of Neural Nets' Parameter Spaces Under Reparametrization

Feb 14, 2023

ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure

Jun 04, 2021

Cockpit: A Practical Debugging Tool for Training Deep Neural Networks

Feb 12, 2021

BackPACK: Packing more into backprop

Feb 15, 2020