Beidi Chen

STEM: Scaling Transformers with Embedding Modules
Jan 15, 2026

RLBoost: Harvesting Preemptible Resources for Cost-Efficient Reinforcement Learning on LLMs
Oct 22, 2025

Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Jun 11, 2025

Kinetics: Rethinking Test-Time Scaling Laws
Jun 06, 2025

Scalable LLM Math Reasoning Acceleration with Low-rank Distillation
May 08, 2025

HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
Feb 18, 2025

APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
Feb 08, 2025

GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?
Feb 07, 2025

Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation
Feb 05, 2025

S$^{2}$FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity
Dec 10, 2024