Transformers generalize differently from information stored in context vs in weights

Oct 11, 2022
Stephanie C. Y. Chan, Ishita Dasgupta, Junkyung Kim, Dharshan Kumaran, Andrew K. Lampinen, Felix Hill

View paper on arXiv