
Qiuhui Liu

Learning Hard Retrieval Cross Attention for Transformer

Sep 30, 2020

Transformer with Depth-Wise LSTM

Jul 13, 2020

Learning Source Phrase Representations for Neural Machine Translation

Jun 25, 2020

Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change

May 05, 2020

Analyzing Word Translation of Transformer Layers

Mar 21, 2020

Why Deep Transformers are Difficult to Converge? From Computation Order to Lipschitz Restricted Parameter Initialization

Nov 08, 2019

UdS Submission for the WMT 19 Automatic Post-Editing Task

Aug 09, 2019

Neutron: An Implementation of the Transformer Translation Model and its Variants

Mar 18, 2019