Hai Zhao

Department of Computer Science and Engineering, Shanghai Jiao Tong University; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University

Keep the Cost Down: A Review on Methods to Optimize LLM's KV-Cache Consumption

Jul 28, 2024

DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems

Jul 15, 2024

Hypergraph based Understanding for Document Semantic Entity Recognition

Jul 09, 2024

Venturing into Uncharted Waters: The Navigation Compass from Transformer to Mamba

Jun 24, 2024

The Music Maestro or The Musically Challenged, A Massive Music Evaluation Benchmark for Large Language Models

Jun 22, 2024

Vript: A Video Is Worth Thousands of Words

Jun 10, 2024

GKT: A Novel Guidance-Based Knowledge Transfer Framework For Efficient Cloud-edge Collaboration LLM Deployment

May 30, 2024

From Role-Play to Drama-Interaction: An LLM Solution

May 23, 2024

PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference

May 21, 2024

SirLLM: Streaming Infinite Retentive LLM

May 21, 2024