Wei-Lin Chiang

RouteLLM: Learning to Route LLMs with Preference Data

Jun 26, 2024

From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

Jun 17, 2024

OR-Bench: An Over-Refusal Benchmark for Large Language Models

May 31, 2024

Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity

Apr 22, 2024

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

Mar 07, 2024

LLM-Assisted Code Cleaning For Training Accurate Code Generators

Nov 25, 2023

Rethinking Benchmark and Contamination for Language Models with Rephrased Samples

Nov 11, 2023

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

Sep 30, 2023

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Jun 09, 2023

Balsa: Learning a Query Optimizer Without Expert Demonstrations

Jan 05, 2022