Sashank J. Reddi

Efficient Document Ranking with Learnable Late Interactions

Jun 25, 2024
Via arXiv

Landscape-Aware Growing: The Power of a Little LAG

Jun 04, 2024
Via arXiv

Depth Dependence of μP Learning Rates in ReLU MLPs

May 13, 2023
Via arXiv

Differentially Private Adaptive Optimization with Delayed Preconditioners

Dec 01, 2022
Via arXiv

On the Algorithmic Stability and Generalization of Adaptive Optimization Methods

Nov 08, 2022
Via arXiv

Large Models are Parsimonious Learners: Activation Sparsity in Trained Transformers

Oct 12, 2022
Via arXiv

Private Adaptive Optimization with Side Information

Feb 12, 2022
Via arXiv

Robust Training of Neural Networks using Scale Invariant Architectures

Feb 02, 2022
Via arXiv

A Field Guide to Federated Optimization

Jul 14, 2021
Via arXiv

Distilling Double Descent

Feb 13, 2021
Via arXiv