LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Dec 12, 2023
Keivan Alizadeh, Iman Mirzadeh, Dmitry Belenko, Karen Khatamifard, Minsik Cho, Carlo C Del Mundo, Mohammad Rastegari, Mehrdad Farajtabar
