Title: BlonD: An Automatic Evaluation Metric for Document-level Machine Translation

Mar 22, 2021

Yuchen Jiang, Shuming Ma, Dongdong Zhang, Jian Yang, Haoyang Huang, Ming Zhou

Figure 1 for BlonD: An Automatic Evaluation Metric for Document-level Machine Translation

Figure 2 for BlonD: An Automatic Evaluation Metric for Document-level Machine Translation

Figure 3 for BlonD: An Automatic Evaluation Metric for Document-level Machine Translation

Figure 4 for BlonD: An Automatic Evaluation Metric for Document-level Machine Translation

Abstract: Standard automatic metrics such as BLEU are problematic for document-level MT evaluation. They can neither distinguish document-level improvements in translation quality from sentence-level ones nor identify the specific discourse phenomena that caused the translation errors. To address these problems, we propose BlonD, an automatic metric for document-level machine translation evaluation. BlonD takes discourse coherence into consideration by calculating the recall and distance of check-pointing phrases and tags, and provides comprehensive evaluation scores by combining these with n-grams. Extensive comparisons between BlonD and existing evaluation metrics illustrate their critical distinctions. Experimental results show that BlonD has much higher document-level sensitivity than previous metrics. Human evaluation also reveals high Pearson R correlation between BlonD scores and manual quality judgments.

* 8 pages

