Alert button
Picture for Xinhao Cheng

Xinhao Cheng

Alert button

FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning

Add code
Bookmark button
Alert button
Feb 29, 2024
Xupeng Miao, Gabriele Oliaro, Xinhao Cheng, Mengdi Wu, Colin Unger, Zhihao Jia

Viaarxiv icon

Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems

Add code
Bookmark button
Alert button
Dec 23, 2023
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, Zhihao Jia

Viaarxiv icon

SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification

Add code
Bookmark button
Alert button
May 16, 2023
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Rae Ying Yee Wong, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia

Figure 1 for SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification
Figure 2 for SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification
Figure 3 for SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification
Figure 4 for SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification
Viaarxiv icon