Sehoon Kim

Characterizing Prompt Compression Methods for Long Context Inference

Jul 11, 2024

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Mar 22, 2024

AI and Memory Wall

Mar 21, 2024

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Feb 07, 2024

Learned Best-Effort LLM Serving

Jan 15, 2024

An LLM Compiler for Parallel Function Calling

Dec 07, 2023

SPEED: Speculative Pipelined Execution for Efficient Decoding

Oct 18, 2023

SqueezeLLM: Dense-and-Sparse Quantization

Jun 13, 2023

Full Stack Optimization of Transformer Inference: a Survey

Feb 27, 2023

Big Little Transformer Decoder

Feb 15, 2023