Xiaoxuan Liu

Optimizing Speculative Decoding for Serving Large Language Models Using Goodput

Jun 20, 2024
Via arXiv

Towards Clinical AI Fairness: Filling Gaps in the Puzzle

May 28, 2024
Via arXiv

Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity

Apr 22, 2024
Via arXiv

Computing in the Era of Large Generative Models: From Cloud-Native to AI-Native

Jan 17, 2024
Via arXiv

Learned Best-Effort LLM Serving

Jan 15, 2024
Via arXiv

Online Speculative Decoding

Oct 17, 2023
Via arXiv

QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources

Oct 11, 2023
Via arXiv

What is the State of Memory Saving for Model Training?

Mar 26, 2023
Via arXiv

GACT: Activation Compressed Training for General Architectures

Jun 28, 2022
Via arXiv

Long-run User Value Optimization in Recommender Systems through Content Creation Modeling

Apr 25, 2022
Via arXiv