Alert button
Picture for Kurt Keutzer

Kurt Keutzer

Alert button

AI and Memory Wall

Mar 21, 2024
Amir Gholami, Zhewei Yao, Sehoon Kim, Coleman Hooper, Michael W. Mahoney, Kurt Keutzer

Viaarxiv icon

ROUTERBENCH: A Benchmark for Multi-LLM Routing System

Mar 18, 2024
Qitian Jason Hu, Jacob Bieker, Xiuyu Li, Nan Jiang, Benjamin Keigwin, Gaurav Ranganath, Kurt Keutzer, Shriyash Kaustubh Upadhyay

Viaarxiv icon

Q-SLAM: Quadric Representations for Monocular SLAM

Mar 12, 2024
Chensheng Peng, Chenfeng Xu, Yue Wang, Mingyu Ding, Heng Yang, Masayoshi Tomizuka, Kurt Keutzer, Marco Pavone, Wei Zhan

Viaarxiv icon

LLM Inference Unveiled: Survey and Roofline Model Insights

Mar 11, 2024
Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer

Viaarxiv icon

Magic-Me: Identity-Specific Video Customized Diffusion

Feb 14, 2024
Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, Xue-She Wang, Xiuyu Li, Huanrui Yang, Zhen Dong, Kurt Keutzer, Jiashi Feng

Viaarxiv icon

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Feb 07, 2024
Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami

Viaarxiv icon

Learned Best-Effort LLM Serving

Jan 15, 2024
Siddharth Jha, Coleman Hooper, Xiaoxuan Liu, Sehoon Kim, Kurt Keutzer

Viaarxiv icon