Torsten Hoefler

Multi-Head RAG: Solving Multi-Aspect Problems with LLMs

Jun 07, 2024

CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks

Jun 04, 2024

QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs

Mar 30, 2024

SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Jan 26, 2024

Topologies of Reasoning: Demystifying Chains, Trees, and Graphs of Thoughts

Jan 25, 2024

Swing: Short-cutting Rings for Higher Bandwidth Allreduce

Jan 17, 2024

DiffDA: a diffusion model for weather-scale data assimilation

Jan 11, 2024

How to Prune Your Language Model: Recovering Accuracy on the "Sparsity May Cry" Benchmark

Dec 21, 2023

HOT: Higher-Order Dynamic Graph Representation Learning with Efficient Transformers

Nov 30, 2023

Chameleon: a Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models

Oct 15, 2023