Ion Stoica

Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Feb 03, 2024
Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang

Fairness in Serving Large Language Models

Dec 31, 2023
Ying Sheng, Shiyi Cao, Dacheng Li, Banghua Zhu, Zhuohan Li, Danyang Zhuo, Joseph E. Gonzalez, Ion Stoica

SuperServe: Fine-Grained Inference Serving for Unpredictable Workloads

Dec 27, 2023
Alind Khare, Dhruv Garg, Sukrit Kalra, Snigdha Grandhi, Ion Stoica, Alexey Tumanov

CodeScholar: Growing Idiomatic Code Examples

Dec 23, 2023
Manish Shetty, Koushik Sen, Ion Stoica

Efficiently Programming Large Language Models using SGLang

Dec 12, 2023
Lianmin Zheng, Liangsheng Yin, Zhiqiang Xie, Jeff Huang, Chuyue Sun, Cody Hao Yu, Shiyi Cao, Christos Kozyrakis, Ion Stoica, Joseph E. Gonzalez, Clark Barrett, Ying Sheng

LLM-Assisted Code Cleaning For Training Accurate Code Generators

Nov 25, 2023
Naman Jain, Tianjun Zhang, Wei-Lin Chiang, Joseph E. Gonzalez, Koushik Sen, Ion Stoica

Rethinking Benchmark and Contamination for Language Models with Rephrased Samples

Nov 11, 2023
Shuo Yang, Wei-Lin Chiang, Lianmin Zheng, Joseph E. Gonzalez, Ion Stoica

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Nov 07, 2023
Ying Sheng, Shiyi Cao, Dacheng Li, Coleman Hooper, Nicholas Lee, Shuo Yang, Christopher Chou, Banghua Zhu, Lianmin Zheng, Kurt Keutzer, Joseph E. Gonzalez, Ion Stoica

Online Speculative Decoding

Oct 17, 2023
Xiaoxuan Liu, Lanxiang Hu, Peter Bailis, Ion Stoica, Zhijie Deng, Alvin Cheung, Hao Zhang
