
Minsik Cho

LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference

Jul 19, 2024

KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation

May 08, 2024

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Dec 12, 2023

Prompting might be all you need to repair Compressed LLMs

Oct 14, 2023

Streaming Anchor Loss: Augmenting Supervision with Temporal Significance

Oct 09, 2023

eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models

Sep 13, 2023

Flexible Keyword Spotting based on Homogeneous Audio-Text Embedding

Aug 12, 2023

Matching Latent Encoding for Audio-Text based Keyword Spotting

Jun 08, 2023

PDP: Parameter-free Differentiable Pruning is All You Need

May 18, 2023

R^2: Range Regularization for Model Compression and Quantization

Mar 14, 2023