Sehoon Kim

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Feb 07, 2024
Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami

Via arXiv

Learned Best-Effort LLM Serving

Jan 15, 2024
Siddharth Jha, Coleman Hooper, Xiaoxuan Liu, Sehoon Kim, Kurt Keutzer

Via arXiv

An LLM Compiler for Parallel Function Calling

Dec 07, 2023
Sehoon Kim, Suhong Moon, Ryan Tabrizi, Nicholas Lee, Michael W. Mahoney, Kurt Keutzer, Amir Gholami

Via arXiv

SPEED: Speculative Pipelined Execution for Efficient Decoding

Oct 18, 2023
Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Hasan Genc, Kurt Keutzer, Amir Gholami, Sophia Shao

Via arXiv

SqueezeLLM: Dense-and-Sparse Quantization

Jun 13, 2023
Sehoon Kim, Coleman Hooper, Amir Gholami, Zhen Dong, Xiuyu Li, Sheng Shen, Michael W. Mahoney, Kurt Keutzer

Via arXiv

Full Stack Optimization of Transformer Inference: a Survey

Feb 27, 2023
Sehoon Kim, Coleman Hooper, Thanakul Wattanawong, Minwoo Kang, Ruohan Yan, Hasan Genc, Grace Dinh, Qijing Huang, Kurt Keutzer, Michael W. Mahoney, Yakun Sophia Shao, Amir Gholami

Via arXiv

Big Little Transformer Decoder

Feb 15, 2023
Sehoon Kim, Karttikeya Mangalam, Jitendra Malik, Michael W. Mahoney, Amir Gholami, Kurt Keutzer

Via arXiv

BigColor: Colorization using a Generative Color Prior for Natural Images

Jul 20, 2022
Geonung Kim, Kyoungkook Kang, Seongtae Kim, Hwayoon Lee, Sehoon Kim, Jonghyun Kim, Seung-Hwan Baek, Sunghyun Cho

Via arXiv

Squeezeformer: An Efficient Transformer for Automatic Speech Recognition

Jun 02, 2022
Sehoon Kim, Amir Gholami, Albert Shaw, Nicholas Lee, Karttikeya Mangalam, Jitendra Malik, Michael W. Mahoney, Kurt Keutzer

Via arXiv