Beidi Chen

Kinetics: Rethinking Test-Time Scaling Laws
Jun 06, 2025

Scalable LLM Math Reasoning Acceleration with Low-rank Distillation
May 08, 2025

HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
Feb 18, 2025

APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
Feb 08, 2025

GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?
Feb 07, 2025

Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation
Feb 05, 2025

S$^{2}$FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity
Dec 10, 2024

On the Surprising Effectiveness of Attention Transfer for Vision Transformers
Nov 14, 2024

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
Oct 28, 2024

MagicPIG: LSH Sampling for Efficient LLM Generation
Oct 21, 2024