The Impact of Depth and Width on Transformer Language Model Generalization

Oct 30, 2023
Jackson Petty, Sjoerd van Steenkiste, Ishita Dasgupta, Fei Sha, Dan Garrette, Tal Linzen
