Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens

Feb 24, 2024
Ziqian Zeng, Jiahong Yu, Qianshi Pang, Zihao Wang, Huiping Zhuang, Cen Chen

View paper on arXiv
