Shaohan Huang

THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption

Jun 02, 2022
Tianyu Chen, Hangbo Bao, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, Jianxin Li, Furu Wei

Task-Specific Expert Pruning for Sparse Mixture-of-Experts

Jun 01, 2022
Tianyu Chen, Shaohan Huang, Yuan Xie, Binxing Jiao, Daxin Jiang, Haoyi Zhou, Jianxin Li, Furu Wei

On the Representation Collapse of Sparse Mixture of Experts

Apr 20, 2022
Zewen Chi, Li Dong, Shaohan Huang, Damai Dai, Shuming Ma, Barun Patra, Saksham Singhal, Payal Bajaj, Xia Song, Furu Wei

DeepNet: Scaling Transformers to 1,000 Layers

Mar 01, 2022
Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Furu Wei

Kformer: Knowledge Injection in Transformer Feed-Forward Layers

Jan 15, 2022
Yunzhi Yao, Shaohan Huang, Ningyu Zhang, Li Dong, Furu Wei, Huajun Chen

PromptBERT: Improving BERT Sentence Embeddings with Prompts

Jan 12, 2022
Ting Jiang, Shaohan Huang, Zihan Zhang, Deqing Wang, Fuzhen Zhuang, Furu Wei, Haizhen Huang, Liangjie Zhang, Qi Zhang

Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task

Nov 03, 2021
Jian Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Li Dong, Shaohan Huang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei

Improving Non-autoregressive Generation with Mixup Training

Oct 21, 2021
Ting Jiang, Shaohan Huang, Zihan Zhang, Deqing Wang, Fuzhen Zhuang, Furu Wei, Haizhen Huang, Liangjie Zhang, Qi Zhang

Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training

Sep 15, 2021
Bo Zheng, Li Dong, Shaohan Huang, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei
