Vaishnavh Nagarajan

Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning

May 30, 2024
Via arXiv

The pitfalls of next-token prediction

Mar 11, 2024
Via arXiv

What do larger image classifiers memorise?

Oct 09, 2023
Via arXiv

The Cost of Down-Scaling Language Models: Fact Recall Deteriorates before In-Context Learning

Oct 07, 2023
Via arXiv

Think before you speak: Training Language Models With Pause Tokens

Oct 03, 2023
Via arXiv

ResMem: Learn what you can and memorize the rest

Feb 03, 2023
Via arXiv

On student-teacher deviations in distillation: does it pay to disobey?

Jan 30, 2023
Via arXiv

Explaining generalization in deep learning: progress and fundamental limits

Oct 17, 2021
Via arXiv

Assessing Generalization of SGD via Disagreement

Jun 25, 2021
Via arXiv

A Learning Theoretic Perspective on Local Explainability

Nov 02, 2020
Via arXiv