Hany Hassan Awadalla

Microsoft, Redmond

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization

Aug 21, 2022

Language Tokens: A Frustratingly Simple Approach Improves Zero-Shot Performance of Multilingual Translation

Aug 11, 2022

Building Multilingual Machine Translation Systems That Serve Arbitrary X-Y Translations

Jun 30, 2022

Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers

May 28, 2022

Ensembling of Distilled Models from Multi-task Teachers for Constrained Resource Language Pairs

Nov 26, 2021

Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task

Nov 03, 2021

Scalable and Efficient MoE Training for Multitask Multilingual Models

Sep 22, 2021

DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

Jun 25, 2021

XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders

Dec 31, 2020

Score Combination for Improved Parallel Corpus Filtering for Low Resource Conditions

Nov 16, 2020