Sanjiv Kumar

It's an Alignment, Not a Trade-off: Revisiting Bias and Variance in Deep Models

Oct 13, 2023
Lin Chen, Michal Lukasik, Wittawat Jitkrittum, Chong You, Sanjiv Kumar

Via arXiv

DistillSpec: Improving Speculative Decoding via Knowledge Distillation

Oct 12, 2023
Yongchao Zhou, Kaifeng Lyu, Ankit Singh Rawat, Aditya Krishna Menon, Afshin Rostamizadeh, Sanjiv Kumar, Jean-François Kagy, Rishabh Agarwal

Via arXiv

What do larger image classifiers memorise?

Oct 09, 2023
Michal Lukasik, Vaishnavh Nagarajan, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar

Via arXiv

Functional Interpolation for Relative Positions Improves Long Context Transformers

Oct 06, 2023
Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontanon, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli

Via arXiv

Think before you speak: Training Language Models With Pause Tokens

Oct 03, 2023
Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, Vaishnavh Nagarajan

Via arXiv

SPEGTI: Structured Prediction for Efficient Generative Text-to-Image Models

Aug 14, 2023
Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam, Andreas Veit, Ayan Chakrabarti, Sanjiv Kumar

Via arXiv

When Does Confidence-Based Cascade Deferral Suffice?

Jul 06, 2023
Wittawat Jitkrittum, Neha Gupta, Aditya Krishna Menon, Harikrishna Narasimhan, Ankit Singh Rawat, Sanjiv Kumar

Via arXiv

Depth Dependence of $μ$P Learning Rates in ReLU MLPs

May 13, 2023
Samy Jelassi, Boris Hanin, Ziwei Ji, Sashank J. Reddi, Srinadh Bhojanapalli, Sanjiv Kumar

Via arXiv

ResMem: Learn what you can and memorize the rest

Feb 03, 2023
Zitong Yang, Michal Lukasik, Vaishnavh Nagarajan, Zonglin Li, Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Sanjiv Kumar

Via arXiv

Learning to reject meets OOD detection: Are all abstentions created equal?

Jan 31, 2023
Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, Sanjiv Kumar

Via arXiv