Sunghyeon Woo

SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving

Mar 03, 2026

Affine-Scaled Attention: Towards Flexible and Stable Transformer Attention

Feb 26, 2026

PrefillShare: A Shared Prefill Module for KV Reuse in Multi-LLM Disaggregated Serving

Feb 12, 2026

DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation

Feb 27, 2024