Hiva Mohammadzadeh

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Feb 07, 2024
Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami

Via arXiv

SPEED: Speculative Pipelined Execution for Efficient Decoding

Oct 18, 2023
Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Hasan Genc, Kurt Keutzer, Amir Gholami, Sophia Shao

Via arXiv