
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers

Mar 02, 2023
Tianlong Chen, Zhenyu Zhang, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang



View paper on: arXiv, OpenReview
