Songlin Yang

Distilling to Hybrid Attention Models via KL-Guided Layer Selection

Dec 23, 2025
Via arXiv

Revisiting MLLM Based Image Quality Assessment: Errors and Remedy

Nov 11, 2025
Via arXiv

Kimi Linear: An Expressive, Efficient Attention Architecture

Oct 30, 2025
Via arXiv

Instant Preference Alignment for Text-to-Image Diffusion Models

Aug 25, 2025
Via arXiv

Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation

Jun 24, 2025
Via arXiv

Log-Linear Attention

Jun 05, 2025
Via arXiv

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training

Jun 05, 2025
Via arXiv

Test-Time Training Done Right

May 29, 2025
Via arXiv

PaTH Attention: Position Encoding via Accumulating Householder Transformations

May 22, 2025
Via arXiv

Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

May 10, 2025
Via arXiv