
Shuming Ma

DeepNet: Scaling Transformers to 1,000 Layers

Mar 01, 2022

Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt

Feb 23, 2022

A Unified Strategy for Multilingual Grammatical Error Correction with Pre-trained Cross-Lingual Language Model

Jan 26, 2022

Phrase-level Adversarial Example Generation for Neural Machine Translation

Jan 06, 2022

SMDT: Selective Memory-Augmented Neural Document Translation

Jan 05, 2022

Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task

Nov 03, 2021

Towards Making the Most of Multilingual Pretraining for Zero-Shot Neural Machine Translation

Oct 16, 2021

XLM-E: Cross-lingual Language Model Pre-training via ELECTRA

Jun 30, 2021

DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

Jun 25, 2021

How Does Distilled Data Complexity Impact the Quality and Confidence of Non-Autoregressive Machine Translation?

May 27, 2021