Lianmin Zheng

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Mar 07, 2024

Efficiently Programming Large Language Models using SGLang
Dec 12, 2023

Rethinking Benchmark and Contamination for Language Models with Rephrased Samples
Nov 11, 2023

S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Nov 07, 2023

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
Sep 30, 2023

Efficient Memory Management for Large Language Model Serving with PagedAttention
Sep 12, 2023

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Jul 19, 2023

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
Jun 09, 2023

On Optimal Caching and Model Multiplexing for Large Model Inference
Jun 03, 2023

High-throughput Generative Inference of Large Language Models with a Single GPU
Mar 13, 2023