Chaojun Xiao

MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling

Feb 12, 2026

Data Science and Technology Towards AGI Part I: Tiered Data Management

Feb 09, 2026

Spava: Accelerating Long-Video Understanding via Sequence-Parallelism-aware Approximate Attention

Jan 29, 2026

Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts

Jan 29, 2026

Revealing the Attention Floating Mechanism in Masked Diffusion Models

Jan 12, 2026

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Dec 18, 2025

MiniCPM4: Ultra-Efficient LLMs on End Devices

Jun 09, 2025

Ultra-FineWeb: Efficient Data Filtering and Verification for High-Quality LLM Training Data

May 08, 2025

APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs

Feb 17, 2025

Densing Law of LLMs

Dec 05, 2024