Yehan Ma

TimeBill: Time-Budgeted Inference for Large Language Models

Dec 26, 2025
Via arXiv

CUDA-LLM: LLMs Can Write Efficient CUDA Kernels

Jun 10, 2025
Via arXiv

Predictive Exit: Prediction of Fine-Grained Early Exits for Computation- and Energy-Efficient Inference

Jun 09, 2022
Via arXiv