DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging

Feb 04, 2024
Matteo Pagliardini, Amirkeivan Mohtashami, François Fleuret, Martin Jaggi
