Alvin Cheung

Optimizing Speculative Decoding for Serving Large Language Models Using Goodput

Jun 20, 2024
Via arXiv

Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity

Apr 22, 2024
Via arXiv

Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks

Mar 07, 2024
Via arXiv

AST-T5: Structure-Aware Pretraining for Code Generation and Understanding

Jan 05, 2024
Via arXiv

Online Speculative Decoding

Oct 17, 2023
Via arXiv

Spatialyze: A Geospatial Video Analytics System with Spatial-Aware Optimizations

Aug 08, 2023
Via arXiv

SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics

May 29, 2023
Via arXiv

Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers

May 21, 2023
Via arXiv

What is the State of Memory Saving for Model Training?

Mar 26, 2023
Via arXiv

ADELT: Transpilation Between Deep Learning Frameworks

Mar 07, 2023
Via arXiv