Saurav Muralidharan

LLM Pruning and Distillation in Practice: The Minitron Approach

Aug 21, 2024

Compact Language Models via Pruning and Knowledge Distillation

Jul 19, 2024

Flextron: Many-in-One Flexible Large Language Model

Jun 11, 2024

The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks

Sep 30, 2023

Understanding the Effect of the Long Tail on Neural Network Compression

Jun 27, 2023

HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity

May 22, 2023

Efficient Sparsely Activated Transformers

Aug 31, 2022

Reliable Model Compression via Label-Preservation-Aware Loss Functions

Dec 03, 2020

A Programmable Approach to Model Compression

Nov 06, 2019