Carlo C Del Mundo

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Dec 12, 2023
Keivan Alizadeh, Iman Mirzadeh, Dmitry Belenko, Karen Khatamifard, Minsik Cho, Carlo C Del Mundo, Mohammad Rastegari, Mehrdad Farajtabar

ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models

Oct 06, 2023
Iman Mirzadeh, Keivan Alizadeh, Sachin Mehta, Carlo C Del Mundo, Oncel Tuzel, Golnoosh Samei, Mohammad Rastegari, Mehrdad Farajtabar

eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models

Sep 13, 2023
Minsik Cho, Keivan A. Vahid, Qichen Fu, Saurabh Adya, Carlo C Del Mundo, Mohammad Rastegari, Devang Naik, Peter Zatloukal
