Ion Stoica

MPC-Minimized Secure LLM Inference

Aug 07, 2024

Networks of Networks: Complexity Class Principles Applied to Compound AI Systems Design

Jul 23, 2024

RouteLLM: Learning to Route LLMs with Preference Data

Jun 26, 2024

Optimizing Speculative Decoding for Serving Large Language Models Using Goodput

Jun 20, 2024

From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

Jun 17, 2024

OR-Bench: An Over-Refusal Benchmark for Large Language Models

May 31, 2024

Crafting Interpretable Embeddings by Asking LLMs Questions

May 26, 2024

Stylus: Automatic Adapter Selection for Diffusion Models

Apr 29, 2024

Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity

Apr 22, 2024

GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications

Apr 10, 2024