Zayd M. K. Zuhri

Predicting the Order of Upcoming Tokens Improves Language Modeling

Aug 26, 2025
Via arXiv

Softpick: No Attention Sink, No Massive Activations with Rectified Softmax

Apr 29, 2025
Via arXiv