
Sanjiv Kumar

Google Research

It's an Alignment, Not a Trade-off: Revisiting Bias and Variance in Deep Models

Oct 13, 2023

DistillSpec: Improving Speculative Decoding via Knowledge Distillation

Oct 12, 2023

What do larger image classifiers memorise?

Oct 09, 2023

Functional Interpolation for Relative Positions Improves Long Context Transformers

Oct 06, 2023

Think before you speak: Training Language Models With Pause Tokens

Oct 03, 2023

SPEGTI: Structured Prediction for Efficient Generative Text-to-Image Models

Aug 14, 2023

When Does Confidence-Based Cascade Deferral Suffice?

Jul 06, 2023

Depth Dependence of μP Learning Rates in ReLU MLPs

May 13, 2023

ResMem: Learn what you can and memorize the rest

Feb 03, 2023

Learning to reject meets OOD detection: Are all abstentions created equal?

Jan 31, 2023