Network Pruning


Network pruning is a popular approach for reducing a heavy network to a lightweight form by removing its redundancy. A complex, over-parameterized network is first trained, then pruned according to some criterion, and finally fine-tuned to achieve comparable performance with fewer parameters.
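The train, prune, fine-tune pipeline described above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not any particular paper's method: the "network" is a single linear layer, the pruning criterion is weight magnitude (one common choice among many), and the helper names `train`, `mse`, and `magnitude_mask` are hypothetical.

```python
import random


def train(weights, mask, data, lr=0.05, epochs=200):
    """Gradient descent on mean squared error; masked weights stay at zero."""
    n = len(data)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for x, y in data:
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for j, xi in enumerate(x):
                grads[j] += 2.0 * err * xi / n
        # Apply the update, then re-apply the mask so pruned weights stay zero.
        weights = [(w - lr * g) * m for w, g, m in zip(weights, grads, mask)]
    return weights


def mse(weights, data):
    """Mean squared error of the linear model on the dataset."""
    return sum(
        (sum(w * xi for w, xi in zip(weights, x)) - y) ** 2 for x, y in data
    ) / len(data)


def magnitude_mask(weights, keep):
    """Assumed criterion: keep the `keep` largest-magnitude weights."""
    ranked = sorted(range(len(weights)), key=lambda j: abs(weights[j]), reverse=True)
    kept = set(ranked[:keep])
    return [1.0 if j in kept else 0.0 for j in range(len(weights))]


# Synthetic data: only the first two of six inputs actually matter,
# so the dense model is over-parameterized for this task.
random.seed(0)
true_w = [3.0, -2.0, 0.0, 0.0, 0.0, 0.0]
data = []
for _ in range(200):
    x = [random.gauss(0, 1) for _ in true_w]
    y = sum(w * xi for w, xi in zip(true_w, x)) + random.gauss(0, 0.01)
    data.append((x, y))

# 1) Train the dense, over-parameterized model.
dense = train([0.0] * 6, [1.0] * 6, data)

# 2) Prune: zero out all but the two largest-magnitude weights.
mask = magnitude_mask(dense, keep=2)
pruned = [w * m for w, m in zip(dense, mask)]

# 3) Fine-tune the surviving weights with the mask held fixed.
finetuned = train(pruned, mask, data, epochs=50)

print("mask:", mask)
print("dense MSE:", mse(dense, data))
print("fine-tuned MSE:", mse(finetuned, data))
```

In this toy setting the pruned model loses almost nothing because the removed weights were redundant by construction; on real networks the accuracy gap after pruning, and how much fine-tuning recovers, depends heavily on the criterion and the sparsity level.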

JGRA: Jacobian Geometry Robustness Assessment in NISQ Noise-Aware Quantum Neural Networks

Jun 08, 2026
Via arXiv

Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters

Jun 07, 2026
Via arXiv

RAPID: Layer-Wise Redundancy-Aware Pruning and Importance-Driven Token Merging for Efficient ViT

Jun 06, 2026
Via arXiv

SpectrumKV: Per-Token Mixed-Precision KV Cache Transfer for Prefill-Decode Disaggregated LLM Serving

Jun 07, 2026
Via arXiv

Joint Structural Pruning and Mixed-Precision Quantization for LLM Compression

Jun 05, 2026
Via arXiv

SFMambaNet: Spectral-Frequency Enhanced Selective State Space Model for Correspondence Pruning

Jun 03, 2026
Via arXiv

STARFISH: faST Accuracy Recovery in pruned networks From Internal State Healing

May 31, 2026
Via arXiv

PSViT: A Methodology for Structurally Pruning Spiking Vision Transformers

Jun 02, 2026
Via arXiv

Zero-Copy Semantic Contagion: An In-Memory Streaming Architecture for Evolving Attention Graphs

Jun 04, 2026
Via arXiv

PrimeSVT: An Automated Memory-aware Pruning Framework with Prioritized Compression Policy for Spiking Vision Transformers

Jun 02, 2026
Via arXiv