Alert button

CHAI: Clustered Head Attention for Efficient LLM Inference

Add code
Bookmark button
Alert button
Mar 12, 2024
Saurabh Agarwal, Bilge Acun, Basil Homer, Mostafa Elhoushi, Yejin Lee, Shivaram Venkataraman, Dimitris Papailiopoulos, Carole-Jean Wu

Figure 1 for CHAI: Clustered Head Attention for Efficient LLM Inference
Figure 2 for CHAI: Clustered Head Attention for Efficient LLM Inference
Figure 3 for CHAI: Clustered Head Attention for Efficient LLM Inference
Figure 4 for CHAI: Clustered Head Attention for Efficient LLM Inference

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: