Layer-Wise Cross-View Decoding for Sequence-to-Sequence Learning

May 16, 2020
Fenglin Liu, Xuancheng Ren, Guangxiang Zhao, Xu Sun


In sequence-to-sequence learning, the attention mechanism has been highly successful at bridging information between the encoder and the decoder. However, it is often overlooked that the decoder has only a single view of the source sequence: the representations generated by the last encoder layer, which are supposed to provide a global view of the source. Such an implementation cuts the decoder off from concrete, fine-grained, local source information. In this work, we explore reusing the representations from different encoder layers for layer-wise cross-view decoding, in which different views of the source sequence are presented to different decoder layers. We investigate multiple representative strategies for cross-view decoding, of which the granularity consistent attention (GCA) strategy proves the most efficient and effective in experiments on the neural machine translation task. In particular, GCA surpasses the previous state-of-the-art architecture on three machine translation datasets.
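To make the idea of presenting different encoder layers to different decoder layers concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the encoder and decoder are toy stand-ins without learned attention projections, and every name (`encode`, `decode`, `layer_to_view`) is hypothetical. The abstract does not spell out which encoder layer each decoder layer sees under GCA, so the one-to-one mapping below (decoder layer i attends to encoder layer i) is an assumed reading.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, memory):
    # Single-head scaled dot-product attention, with no learned projections.
    d = queries.shape[-1]
    scores = queries @ memory.T / np.sqrt(d)   # (tgt_len, src_len)
    return softmax(scores) @ memory            # (tgt_len, d)

def encode(src, num_layers, rng):
    # Toy encoder that keeps the output of every layer, not only the last one.
    views, h = [], src
    for _ in range(num_layers):
        d = h.shape[-1]
        w = rng.standard_normal((d, d)) / np.sqrt(d)
        h = np.tanh(h @ w)
        views.append(h)
    return views

def decode(tgt, views, layer_to_view):
    # layer_to_view[i] selects which encoder layer decoder layer i attends to.
    h = tgt
    for view_idx in layer_to_view:
        h = h + cross_attention(h, views[view_idx])  # residual connection
    return h

rng = np.random.default_rng(0)
num_layers, d_model = 4, 8
src = rng.standard_normal((5, d_model))   # 5 source positions
tgt = rng.standard_normal((3, d_model))   # 3 target positions
views = encode(src, num_layers, rng)

# Conventional decoding: every decoder layer sees only the last encoder layer.
single_view = decode(tgt, views, [num_layers - 1] * num_layers)

# Layer-wise cross-view decoding (assumed mapping): layer i sees encoder layer i.
cross_view = decode(tgt, views, list(range(num_layers)))
```

The only difference between the two calls is the `layer_to_view` list, which is what makes it cheap to compare alternative strategies for assigning source views to decoder layers.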

* Achieves state-of-the-art BLEU scores on WMT14 EN-DE, EN-FR, and IWSLT DE-EN datasets
