MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers

Dec 31, 2020
Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei

Figures 1 to 4 for MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
