Dan Alistarh

Knowledge Distillation Performs Partial Variance Reduction

May 27, 2023
Mher Safaryan, Alexandra Peste, Dan Alistarh

Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures

Apr 25, 2023
Eugenia Iofinova, Alexandra Peste, Dan Alistarh

Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression

Mar 25, 2023
Denis Kuznedelev, Soroush Tabesh, Kimia Noorbakhsh, Elias Frantar, Sara Beery, Eldar Kurtic, Dan Alistarh

SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks

Feb 09, 2023
Mahdi Nikdan, Tommaso Pegolotti, Eugenia Iofinova, Eldar Kurtic, Dan Alistarh

ZipLM: Hardware-Aware Structured Pruning of Language Models

Feb 07, 2023
Eldar Kurtic, Elias Frantar, Dan Alistarh

Quantized Distributed Training of Large Models with Convergence Guarantees

Feb 05, 2023
Ilia Markov, Adrian Vladu, Qi Guo, Dan Alistarh

Massive Language Models Can Be Accurately Pruned in One-Shot

Jan 02, 2023
Elias Frantar, Dan Alistarh

L-GreCo: An Efficient and General Framework for Layerwise-Adaptive Gradient Compression

Oct 31, 2022
Mohammadreza Alimohammadi, Ilia Markov, Elias Frantar, Dan Alistarh

GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

Oct 31, 2022
Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh

oViT: An Accurate Second-Order Pruning Framework for Vision Transformers

Oct 14, 2022
Denis Kuznedelev, Eldar Kurtic, Elias Frantar, Dan Alistarh
