Shuming Ma

UM4: Unified Multilingual Multiple Teacher-Student Model for Zero-Resource Neural Machine Translation

Jul 11, 2022
Jian Yang, Yuwei Yin, Shuming Ma, Dongdong Zhang, Shuangzhi Wu, Hongcheng Guo, Zhoujun Li, Furu Wei

Language Models are General-Purpose Interfaces

Jun 13, 2022
Yaru Hao, Haoyu Song, Li Dong, Shaohan Huang, Zewen Chi, Wenhui Wang, Shuming Ma, Furu Wei

On the Representation Collapse of Sparse Mixture of Experts

Apr 20, 2022
Zewen Chi, Li Dong, Shaohan Huang, Damai Dai, Shuming Ma, Barun Patra, Saksham Singhal, Payal Bajaj, Xia Song, Furu Wei

StableMoE: Stable Routing Strategy for Mixture of Experts

Apr 18, 2022
Damai Dai, Li Dong, Shuming Ma, Bo Zheng, Zhifang Sui, Baobao Chang, Furu Wei

DeepNet: Scaling Transformers to 1,000 Layers

Mar 01, 2022
Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Furu Wei

Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt

Feb 23, 2022
Lianzhe Huang, Shuming Ma, Dongdong Zhang, Furu Wei, Houfeng Wang

A Unified Strategy for Multilingual Grammatical Error Correction with Pre-trained Cross-Lingual Language Model

Jan 26, 2022
Xin Sun, Tao Ge, Shuming Ma, Jingjing Li, Furu Wei, Houfeng Wang

Phrase-level Adversarial Example Generation for Neural Machine Translation

Jan 06, 2022
Juncheng Wan, Jian Yang, Shuming Ma, Dongdong Zhang, Weinan Zhang, Yong Yu, Furu Wei

SMDT: Selective Memory-Augmented Neural Document Translation

Jan 05, 2022
Xu Zhang, Jian Yang, Haoyang Huang, Shuming Ma, Dongdong Zhang, Jinlong Li, Furu Wei

Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task

Nov 03, 2021
Jian Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Li Dong, Shaohan Huang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei
