Sanjeev Arora

A Quadratic Synchronization Rule for Distributed Deep Learning

Oct 22, 2023
Via arXiv

A Theory for Emergence of Complex Skills in Language Models

Jul 29, 2023
Via arXiv

Trainable Transformer in Transformer

Jul 03, 2023
Via arXiv

Fine-Tuning Language Models with Just Forward Passes

May 27, 2023
Via arXiv

Do Transformers Parse while Predicting the Masked Word?

Mar 14, 2023
Via arXiv

Why does Local SGD Generalize Better than SGD?

Mar 09, 2023
Via arXiv

Task-Specific Skill Localization in Fine-tuned Language Models

Feb 13, 2023
Via arXiv

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound

Nov 05, 2022
Via arXiv

A Kernel-Based View of Language Model Fine-Tuning

Oct 11, 2022
Via arXiv

Understanding Influence Functions and Datamodels via Harmonic Analysis

Oct 03, 2022
Via arXiv