Torsten Hoefler

QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs

Mar 30, 2024
Saleh Ashkboos, Amirkeivan Mohtashami, Maximilian L. Croci, Bo Li, Martin Jaggi, Dan Alistarh, Torsten Hoefler, James Hensman

SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Jan 26, 2024
Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari do Nascimento, Torsten Hoefler, James Hensman

Topologies of Reasoning: Demystifying Chains, Trees, and Graphs of Thoughts

Jan 25, 2024
Maciej Besta, Florim Memedi, Zhenyu Zhang, Robert Gerstenberger, Nils Blach, Piotr Nyczyk, Marcin Copik, Grzegorz Kwaśniewski, Jürgen Müller, Lukas Gianinazzi, Ales Kubicek, Hubert Niewiadomski, Onur Mutlu, Torsten Hoefler

Swing: Short-cutting Rings for Higher Bandwidth Allreduce

Jan 17, 2024
Daniele De Sensi, Tommaso Bonato, David Saam, Torsten Hoefler

DiffDA: a diffusion model for weather-scale data assimilation

Jan 11, 2024
Langwen Huang, Lukas Gianinazzi, Yuejiang Yu, Peter D. Dueben, Torsten Hoefler

How to Prune Your Language Model: Recovering Accuracy on the "Sparsity May Cry" Benchmark

Dec 21, 2023
Eldar Kurtic, Torsten Hoefler, Dan Alistarh

HOT: Higher-Order Dynamic Graph Representation Learning with Efficient Transformers

Nov 30, 2023
Maciej Besta, Afonso Claudino Catarino, Lukas Gianinazzi, Nils Blach, Piotr Nyczyk, Hubert Niewiadomski, Torsten Hoefler

Chameleon: a Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models

Oct 15, 2023
Wenqi Jiang, Marco Zeller, Roger Waleffe, Torsten Hoefler, Gustavo Alonso

Towards End-to-end 4-Bit Inference on Generative Large Language Models

Oct 13, 2023
Saleh Ashkboos, Ilia Markov, Elias Frantar, Tingxuan Zhong, Xincheng Wang, Jie Ren, Torsten Hoefler, Dan Alistarh

VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores

Oct 03, 2023
Roberto L. Castro, Andrei Ivanov, Diego Andrade, Tal Ben-Nun, Basilio B. Fraguela, Torsten Hoefler
