SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification

May 16, 2023
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Rae Ying Yee Wong, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia

View paper on arXiv
