Saurabh Agarwal

CHAI: Clustered Head Attention for Efficient LLM Inference

Mar 12, 2024
Saurabh Agarwal, Bilge Acun, Basil Hosmer, Mostafa Elhoushi, Yejin Lee, Shivaram Venkataraman, Dimitris Papailiopoulos, Carole-Jean Wu

Via arXiv

Decoding Speculative Decoding

Feb 02, 2024
Minghao Yan, Saurabh Agarwal, Shivaram Venkataraman

Via arXiv

MultiFusionNet: Multilayer Multimodal Fusion of Deep Neural Networks for Chest X-Ray Image Classification

Jan 01, 2024
Saurabh Agarwal, K. V. Arya, Yogesh Kumar Meena

Via arXiv

Cuttlefish: Low-Rank Model Training without All the Tuning

May 05, 2023
Hongyi Wang, Saurabh Agarwal, Pongsakorn U-chupala, Yoshiki Tanaka, Eric P. Xing, Dimitris Papailiopoulos

Via arXiv

BagPipe: Accelerating Deep Recommendation Model Training

Feb 24, 2022
Saurabh Agarwal, Ziyi Zhang, Shivaram Venkataraman

Via arXiv

Pufferfish: Communication-efficient Models At No Extra Cost

Mar 05, 2021
Hongyi Wang, Saurabh Agarwal, Dimitris Papailiopoulos

Via arXiv

On the Utility of Gradient Compression in Distributed Training Systems

Mar 03, 2021
Saurabh Agarwal, Hongyi Wang, Shivaram Venkataraman, Dimitris Papailiopoulos

Via arXiv

AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning

Feb 02, 2021
Yuhan Liu, Saurabh Agarwal, Shivaram Venkataraman

Via arXiv

Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification

Oct 29, 2020
Saurabh Agarwal, Hongyi Wang, Kangwook Lee, Shivaram Venkataraman, Dimitris Papailiopoulos

Via arXiv