Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding

Jan 15, 2024
Heming Xia, Zhe Yang, Qingxiu Dong, Peiyi Wang, Yongqi Li, Tao Ge, Tianyu Liu, Wenjie Li, Zhifang Sui