Shaoyuan Chen

OTARo: Once Tuning for All Precisions toward Robust On-Device LLMs

Nov 17, 2025
Via arXiv

Federated Knowledge Transfer Fine-tuning Large Server Model with Resource-Constrained IoT Clients

Jul 07, 2024
Via arXiv

Efficient and Economic Large Language Model Inference with Attention Offloading

May 03, 2024
Via arXiv