Boxing Chen

Huawei Noah's Ark Lab

Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts

Aug 13, 2025

PoTPTQ: A Two-step Power-of-Two Post-training for LLMs

Jul 16, 2025

ECHO-LLaMA: Efficient Caching for High-Performance LLaMA Training

May 22, 2025

Resona: Improving Context Copying in Linear Recurrence Models with Retrieval

Mar 28, 2025

Balcony: A Lightweight Approach to Dynamic Inference of Generative Language Models

Mar 06, 2025

R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning

Feb 27, 2025

ReGLA: Refining Gated Linear Attention

Feb 03, 2025

ZETA: Leveraging Z-order Curves for Efficient Top-k Attention

Jan 24, 2025

Rethinking Post-Training Quantization: Introducing a Statistical Pre-Calibration Approach

Jan 15, 2025

Batch-Max: Higher LLM Throughput using Larger Batch Sizes and KV Cache Compression

Dec 07, 2024