Nikhil Bhendawade

Speculative Streaming: Fast LLM Inference without Auxiliary Models

Feb 16, 2024
Via arXiv

FastSeq: Make Sequence Generation Faster

Jun 08, 2021
Via arXiv

EL-Attention: Memory Efficient Lossless Attention for Generation

May 11, 2021
Via arXiv