Zhihao Jia

Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

Feb 29, 2024
Zhuoming Chen, Avner May, Ruslan Svirschevski, Yuhsun Huang, Max Ryabinin, Zhihao Jia, Beidi Chen

FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning

Feb 29, 2024
Xupeng Miao, Gabriele Oliaro, Xinhao Cheng, Mengdi Wu, Colin Unger, Zhihao Jia

Accelerating Retrieval-Augmented Language Model Serving with Speculation

Jan 25, 2024
Zhihao Zhang, Alan Zhu, Lijie Yang, Yihua Xu, Lanting Li, Phitchaya Mangpo Phothilimthana, Zhihao Jia

Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models

Jan 13, 2024
Zhengxin Zhang, Dan Zhao, Xupeng Miao, Gabriele Oliaro, Qing Li, Yong Jiang, Zhihao Jia

Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems

Dec 23, 2023
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, Zhihao Jia

SpotServe: Serving Generative Large Language Models on Preemptible Instances

Nov 27, 2023
Xupeng Miao, Chunan Shi, Jiangfei Duan, Xiaoli Xi, Dahua Lin, Bin Cui, Zhihao Jia

Drone-NeRF: Efficient NeRF Based 3D Scene Reconstruction for Large-Scale Drone Survey

Aug 30, 2023
Zhihao Jia, Bing Wang, Changhao Chen

Quarl: A Learning-Based Quantum Circuit Optimizer

Jul 17, 2023
Zikun Li, Jinjun Peng, Yixuan Mei, Sina Lin, Yi Wu, Oded Padon, Zhihao Jia

SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification

May 16, 2023
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Rae Ying Yee Wong, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia
