Songlin Yang

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training

Jun 05, 2025

Log-Linear Attention

Jun 05, 2025

Test-Time Training Done Right

May 29, 2025

PaTH Attention: Position Encoding via Accumulating Householder Transformations

May 22, 2025

Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

May 10, 2025

Inductive Spatio-Temporal Kriging with Physics-Guided Increment Training Strategy for Air Quality Inference

Mar 12, 2025

Textured 3D Regenerative Morphing with 3D Diffusion Prior

Feb 20, 2025

ARFlow: Autoregressive Flow with Hybrid Linear Attention

Jan 27, 2025

Gated Delta Networks: Improving Mamba2 with Delta Rule

Dec 09, 2024

Stick-breaking Attention

Oct 23, 2024