Zhihao Jia

Accelerating Retrieval-Augmented Language Model Serving with Speculation

Jan 25, 2024
Zhihao Zhang, Alan Zhu, Lijie Yang, Yihua Xu, Lanting Li, Phitchaya Mangpo Phothilimthana, Zhihao Jia

Via arXiv

Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models

Jan 13, 2024
Zhengxin Zhang, Dan Zhao, Xupeng Miao, Gabriele Oliaro, Qing Li, Yong Jiang, Zhihao Jia

Via arXiv

Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems

Dec 23, 2023
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, Zhihao Jia

Via arXiv

SpotServe: Serving Generative Large Language Models on Preemptible Instances

Nov 27, 2023
Xupeng Miao, Chunan Shi, Jiangfei Duan, Xiaoli Xi, Dahua Lin, Bin Cui, Zhihao Jia

Via arXiv

Drone-NeRF: Efficient NeRF Based 3D Scene Reconstruction for Large-Scale Drone Survey

Aug 30, 2023
Zhihao Jia, Bing Wang, Changhao Chen

Via arXiv

Quarl: A Learning-Based Quantum Circuit Optimizer

Jul 17, 2023
Zikun Li, Jinjun Peng, Yixuan Mei, Sina Lin, Yi Wu, Oded Padon, Zhihao Jia

Via arXiv

SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification

May 16, 2023
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Rae Ying Yee Wong, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia

Via arXiv

Quark: A Gradient-Free Quantum Learning Framework for Classification Tasks

Oct 02, 2022
Zhihao Zhang, Zhuoming Chen, Heyang Huang, Zhihao Jia

Via arXiv

OLLIE: Derivation-based Tensor Program Optimizer

Aug 02, 2022
Liyan Zheng, Haojie Wang, Jidong Zhai, Muyan Hu, Zixuan Ma, Tuowei Wang, Shizhi Tang, Lei Xie, Kezhao Huang, Zhihao Jia

Via arXiv

Benchmarking Node Outlier Detection on Graphs

Jun 21, 2022
Kay Liu, Yingtong Dou, Yue Zhao, Xueying Ding, Xiyang Hu, Ruitong Zhang, Kaize Ding, Canyu Chen, Hao Peng, Kai Shu, Lichao Sun, Jundong Li, George H. Chen, Zhihao Jia, Philip S. Yu

Via arXiv