
Xiaodan Song


Token Dropping for Efficient BERT Pretraining

Mar 24, 2022
Le Hou, Richard Yuanzhe Pang, Tianyi Zhou, Yuexin Wu, Xinying Song, Xiaodan Song, Denny Zhou


Auto-scaling Vision Transformers without Training

Feb 27, 2022
Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wang, Denny Zhou


A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation

Dec 17, 2021
Wuyang Chen, Xianzhi Du, Fan Yang, Lucas Beyer, Xiaohua Zhai, Tsung-Yi Lin, Huizhong Chen, Jing Li, Xiaodan Song, Zhangyang Wang, Denny Zhou


Speeding up Deep Model Training by Sharing Weights and Then Unsharing

Oct 08, 2021
Shuo Yang, Le Hou, Xiaodan Song, Qiang Liu, Denny Zhou


Efficient Scale-Permuted Backbone with Learned Resource Distribution

Oct 22, 2020
Xianzhi Du, Tsung-Yi Lin, Pengchong Jin, Yin Cui, Mingxing Tan, Quoc Le, Xiaodan Song


Go Wide, Then Narrow: Efficient Training of Deep Thin Networks

Jul 01, 2020
Denny Zhou, Mao Ye, Chen Chen, Tianjian Meng, Mingxing Tan, Xiaodan Song, Quoc Le, Qiang Liu, Dale Schuurmans


MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices

Apr 14, 2020
Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, Denny Zhou


BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models

Mar 24, 2020
Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Ruoming Pang, Quoc Le


SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization

Dec 10, 2019
Xianzhi Du, Tsung-Yi Lin, Pengchong Jin, Golnaz Ghiasi, Mingxing Tan, Yin Cui, Quoc V. Le, Xiaodan Song
