Zeming Ma

KVPruner: Structural Pruning for Faster and Memory-Efficient Large Language Models

Sep 17, 2024
Via arXiv