Renee St. Amant

Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers

May 17, 2024
Via arXiv