Elias Frantar

Extreme Compression of Large Language Models via Additive Quantization
Jan 11, 2024
Vage Egiazarian, Andrei Panferov, Denis Kuznedelev, Elias Frantar, Artem Babenko, Dan Alistarh

QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models
Oct 25, 2023
Elias Frantar, Dan Alistarh

Towards End-to-end 4-Bit Inference on Generative Large Language Models
Oct 13, 2023
Saleh Ashkboos, Ilia Markov, Elias Frantar, Tingxuan Zhong, Xincheng Wang, Jie Ren, Torsten Hoefler, Dan Alistarh

Sparse Fine-tuning for Inference Acceleration of Large Language Models
Oct 13, 2023
Eldar Kurtic, Denis Kuznedelev, Elias Frantar, Michael Goin, Dan Alistarh

Sparse Finetuning for Inference Acceleration of Large Language Models
Oct 10, 2023
Eldar Kurtic, Denis Kuznedelev, Elias Frantar, Michael Goin, Dan Alistarh

Scaling Laws for Sparsely-Connected Foundation Models
Sep 15, 2023
Elias Frantar, Carlos Riquelme, Neil Houlsby, Dan Alistarh, Utku Evci

Accurate Neural Network Pruning Requires Rethinking Sparse Optimization
Aug 03, 2023
Denis Kuznedelev, Eldar Kurtic, Eugenia Iofinova, Elias Frantar, Alexandra Peste, Dan Alistarh

QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models
Jul 07, 2023
Tommaso Pegolotti, Elias Frantar, Dan Alistarh, Markus Püschel

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
Jun 05, 2023
Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh

JaxPruner: A concise library for sparsity research
May 02, 2023
Joo Hyung Lee, Wonpyo Park, Nicole Mitchell, Jonathan Pilault, Johan Obando-Ceron, Han-Byul Kim, Namhoon Lee, Elias Frantar, Yun Long, Amir Yazdanbakhsh, Shivani Agrawal, Suvinay Subramanian, Xin Wang, Sheng-Chun Kao, Xingyao Zhang, Trevor Gale, Aart Bik, Woohyun Han, Milen Ferev, Zhonglin Han, Hong-Seok Kim, Yann Dauphin, Gintare Karolina Dziugaite, Pablo Samuel Castro, Utku Evci