Title: Translationese as a Language in "Multilingual" NMT

Nov 10, 2019

Parker Riley, Isaac Caswell, Markus Freitag, David Grangier

Abstract: Machine translation has an undesirable propensity to produce "translationese" artifacts, which can lead to higher BLEU scores while being liked less by human raters. Motivated by this, we model translationese and original (i.e., natural) text as separate languages in a multilingual model, and pose the question: can we perform zero-shot translation between original source text and original target text? There is no data with original source and original target, so we train sentence-level classifiers to distinguish translationese from original target text, and use these classifiers to tag the training data for an NMT model. Using this technique we bias the model to produce more natural outputs at test time, yielding gains in human evaluation scores on both accuracy and fluency. Additionally, we demonstrate that it is possible to bias the model to produce translationese and game the BLEU score, increasing it while decreasing human-rated quality. We analyze these models using metrics to measure the degree of translationese in the output, and present an analysis of the capriciousness of heuristically-based train-data tagging.
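The tagging step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tag tokens `<orig>` and `<trans>`, the choice to prepend the tag to the source sentence, the 0.5 threshold, and the `toy_classifier` stand-in are all assumptions made for illustration. The abstract only states that a sentence-level classifier is used to tag the NMT training data.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical tag tokens; the abstract does not specify the actual tags.
ORIGINAL_TAG = "<orig>"
TRANSLATIONESE_TAG = "<trans>"


def tag_training_pairs(
    pairs: Iterable[Tuple[str, str]],
    p_original: Callable[[str], float],
    threshold: float = 0.5,  # assumed decision threshold
) -> List[Tuple[str, str]]:
    """Prefix each source sentence with a tag chosen from the classifier's
    score for the target sentence (probability that it is original text)."""
    tagged = []
    for source, target in pairs:
        if p_original(target) >= threshold:
            tag = ORIGINAL_TAG
        else:
            tag = TRANSLATIONESE_TAG
        tagged.append((f"{tag} {source}", target))
    return tagged


def toy_classifier(target: str) -> float:
    """Placeholder for a trained sentence-level classifier (assumption)."""
    return 0.9 if "natural" in target else 0.1


pairs = [
    ("ein Satz", "a natural sentence"),
    ("noch ein Satz", "another sentence"),
]
tagged = tag_training_pairs(pairs, toy_classifier)
for source, target in tagged:
    print(source, "=>", target)
```

Under these assumptions, a model trained on the tagged pairs could be steered at test time by prefixing every input with `<orig>` to favor natural output, or with `<trans>` to favor translationese.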
