Vladimir Karpukhin

Aligned Cross Entropy for Non-Autoregressive Machine Translation

Apr 03, 2020

Marjan Ghazvininejad, Vladimir Karpukhin, Luke Zettlemoyer, Omer Levy

Abstract: Non-autoregressive machine translation models significantly speed up decoding by allowing for parallel prediction of the entire target sequence. However, modeling word order is more challenging due to the lack of autoregressive factors in the model. This difficulty is compounded during training with cross entropy loss, which can heavily penalize small shifts in word order. In this paper, we propose aligned cross entropy (AXE) as an alternative loss function for training non-autoregressive models. AXE uses a differentiable dynamic program to assign loss based on the best possible monotonic alignment between target tokens and model predictions. AXE-based training of conditional masked language models (CMLMs) substantially improves performance on major WMT benchmarks, while setting a new state of the art for non-autoregressive models.
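
To illustrate the idea of scoring predictions against the best monotonic alignment, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name `axe_loss`, the three alignment moves (align, skip a prediction by charging it for a blank token, skip a target by reusing the current prediction), the `blank` index, and the `delta` weight are assumptions chosen for illustration, and the code computes only the loss value rather than a differentiable version.

```python
import math

def axe_loss(log_probs, target, blank, delta=1.0):
    """Loss under the cheapest monotonic alignment (illustrative sketch).

    log_probs[j][v] is the log-probability of token v at prediction slot j.
    target is a list of token ids; blank is the id of an assumed "empty" token.
    delta scales the assumed penalty for placing two targets on one slot.
    """
    n, m = len(target), len(log_probs)
    inf = float("inf")
    # cost[i][j]: best loss aligning the first i targets to the first j slots.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(1, m + 1):
            # Skip prediction: slot j is left unaligned and charged for blank.
            skip_prediction = cost[i][j - 1] - log_probs[j - 1][blank]
            if i == 0:
                cost[i][j] = skip_prediction
                continue
            token_lp = log_probs[j - 1][target[i - 1]]
            # Align: target i is matched to slot j.
            align = cost[i - 1][j - 1] - token_lp
            # Skip target: target i shares slot j with the previous target.
            skip_target = cost[i - 1][j] - delta * token_lp
            cost[i][j] = min(align, skip_prediction, skip_target)
    return cost[n][m]
```

When the model emits the right tokens one slot late, position-by-position cross entropy charges every slot, while this alignment-based loss skips the first slot and stays small.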

Training on Synthetic Noise Improves Robustness to Natural Noise in Machine Translation

Feb 05, 2019

Vladimir Karpukhin, Omer Levy, Jacob Eisenstein, Marjan Ghazvininejad

Abstract: We consider the problem of making machine translation more robust to character-level variation on the source side, such as typos. Existing methods achieve greater coverage by applying subword models such as byte-pair encoding (BPE) and character-level encoders, but these methods are highly sensitive to spelling mistakes. We show how training on a mild amount of random synthetic noise can dramatically improve robustness to these variations, without diminishing performance on clean text. We focus on translation performance on natural noise, as captured by frequent corrections in Wikipedia edit logs, and show that robustness to such noise can be achieved using a balanced diet of simple synthetic noises at training time, without access to the natural noise data or distribution.
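
As a rough illustration of what injecting simple synthetic noise into source text might look like, here is a small Python sketch. The abstract does not list the noise types, so the four character-level operations (delete, insert, substitute, swap), the function names, and the `rate` and `seed` parameters are all assumptions made for this example, not the paper's exact procedure.

```python
import random
import string

def noise_word(word, rng):
    """Apply one randomly chosen character-level edit to a word (len >= 2)."""
    op = rng.choice(["delete", "insert", "substitute", "swap"])
    i = rng.randrange(len(word))
    letter = rng.choice(string.ascii_lowercase)
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "insert":
        return word[:i] + letter + word[i:]
    if op == "substitute":
        return word[:i] + letter + word[i + 1:]
    # Swap two adjacent characters; clamp so position i + 1 exists.
    i = min(i, len(word) - 2)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]

def add_synthetic_noise(sentence, rate=0.1, seed=None):
    """Corrupt each whitespace-separated word with probability `rate`."""
    rng = random.Random(seed)
    noisy = []
    for word in sentence.split():
        # Words shorter than two characters are left untouched.
        if len(word) >= 2 and rng.random() < rate:
            word = noise_word(word, rng)
        noisy.append(word)
    return " ".join(noisy)
```

Calling `add_synthetic_noise(source_sentence, rate=0.1)` on a fraction of training sources would mix clean and corrupted inputs; a suitable rate is something to tune, not a value taken from the paper.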
