Saurav Muralidharan

Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning

Apr 15, 2025

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Apr 10, 2025

EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation

Oct 28, 2024

MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models

Sep 26, 2024

LLM Pruning and Distillation in Practice: The Minitron Approach

Aug 21, 2024

Compact Language Models via Pruning and Knowledge Distillation

Jul 19, 2024

Flextron: Many-in-One Flexible Large Language Model

Jun 11, 2024

The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks

Sep 30, 2023

Understanding the Effect of the Long Tail on Neural Network Compression

Jun 27, 2023

HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity

May 22, 2023