Yuchen Xie

FG$^2$-GDN: Enhancing Long-Context Gated Delta Networks with Doubly Fine-Grained Control

Apr 21, 2026
Via arXiv

SparseBalance: Load-Balanced Long Context Training with Dynamic Sparse Attention

Apr 15, 2026
Via arXiv

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Apr 11, 2026
Via arXiv

AsyncTLS: Efficient Generative LLM Inference with Asynchronous Two-level Sparse Attention

Apr 09, 2026
Via arXiv

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Mar 29, 2026
Via arXiv

A Geometry-Adaptive Deep Variational Framework for Phase Discovery in the Landau-Brazovskii Model

Mar 05, 2026
Via arXiv

SnapMLA: Efficient Long-Context MLA Decoding via Hardware-Aware FP8 Quantized Pipelining

Feb 12, 2026
Via arXiv

Scaling Embeddings Outperforms Scaling Experts in Language Models

Jan 29, 2026
Via arXiv

LongCat-Flash-Thinking-2601 Technical Report

Jan 23, 2026
Via arXiv

EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience

Jan 23, 2026
Via arXiv