Sanjeev Arora

Task-Specific Skill Localization in Fine-tuned Language Models

Feb 13, 2023

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound

Nov 05, 2022

A Kernel-Based View of Language Model Fine-Tuning

Oct 11, 2022

Understanding Influence Functions and Datamodels via Harmonic Analysis

Oct 03, 2022

Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent

Jul 08, 2022

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction

Jun 14, 2022

On the SDEs and Scaling Rules for Adaptive Gradient Algorithms

May 20, 2022

Understanding Gradient Descent on Edge of Stability in Deep Learning

May 19, 2022

Adaptive Gradient Methods with Local Guarantees

Mar 05, 2022

Understanding Contrastive Learning Requires Incorporating Inductive Biases

Feb 28, 2022