Ankit Singh Rawat

From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers

Feb 21, 2024
M. Emrullah Ildiz, Yixiao Huang, Yingcong Li, Ankit Singh Rawat, Samet Oymak

DistillSpec: Improving Speculative Decoding via Knowledge Distillation

Oct 12, 2023
Yongchao Zhou, Kaifeng Lyu, Ankit Singh Rawat, Aditya Krishna Menon, Afshin Rostamizadeh, Sanjiv Kumar, Jean-François Kagy, Rishabh Agarwal

What do larger image classifiers memorise?

Oct 09, 2023
Michal Lukasik, Vaishnavh Nagarajan, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar

Think before you speak: Training Language Models With Pause Tokens

Oct 03, 2023
Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, Vaishnavh Nagarajan

When Does Confidence-Based Cascade Deferral Suffice?

Jul 06, 2023
Wittawat Jitkrittum, Neha Gupta, Aditya Krishna Menon, Harikrishna Narasimhan, Ankit Singh Rawat, Sanjiv Kumar

On the Role of Attention in Prompt-tuning

Jun 06, 2023
Samet Oymak, Ankit Singh Rawat, Mahdi Soltanolkotabi, Christos Thrampoulidis

ResMem: Learn what you can and memorize the rest

Feb 03, 2023
Zitong Yang, Michal Lukasik, Vaishnavh Nagarajan, Zonglin Li, Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Sanjiv Kumar

Supervision Complexity and its Role in Knowledge Distillation

Jan 28, 2023
Hrayr Harutyunyan, Ankit Singh Rawat, Aditya Krishna Menon, Seungyeon Kim, Sanjiv Kumar

EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval

Jan 27, 2023
Seungyeon Kim, Ankit Singh Rawat, Manzil Zaheer, Sadeep Jayasumana, Veeranjaneyulu Sadhanala, Wittawat Jitkrittum, Aditya Krishna Menon, Rob Fergus, Sanjiv Kumar

Large Language Models with Controllable Working Memory

Nov 09, 2022
Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar
