Anshumali Shrivastava

LeanQuant: Accurate Large Language Model Quantization with Loss-Error-Aware Grid

Jul 14, 2024

IDentity with Locality: An ideal hash for gene sequence search

Jun 21, 2024

KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization

May 07, 2024

NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention

Mar 02, 2024

Wisdom of Committee: Distilling from Foundation Model to Specialized Application Model

Feb 27, 2024

Learning Scalable Structural Representations for Link Prediction with Bloom Signatures

Dec 28, 2023

Contractive error feedback for gradient compression

Dec 13, 2023

Adaptive Sampling for Deep Learning via Efficient Nonparametric Proxies

Nov 22, 2023

Heterogeneous federated collaborative filtering using FAIR: Federated Averaging in Random Subspaces

Nov 03, 2023

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Oct 26, 2023