Shaoyuan Chen

Federated Knowledge Transfer Fine-tuning Large Server Model with Resource-Constrained IoT Clients

Jul 07, 2024
Via arXiv

Efficient and Economic Large Language Model Inference with Attention Offloading

May 03, 2024
Via arXiv