Sebastian Goodman

Multi-stage Pretraining for Abstractive Summarization

Sep 23, 2019

Sebastian Goodman, Zhenzhong Lan, Radu Soricut

Abstract: Neural models for abstractive summarization tend to achieve the best performance in the presence of highly specialized, summarization-specific modeling add-ons such as pointer-generator, coverage-modeling, and inference-time heuristics. We show here that pretraining can complement such modeling advancements to yield improved results in both short-form and long-form abstractive summarization using two key concepts: full-network initialization and multi-stage pretraining. Our method allows the model to transitively benefit from multiple pretraining tasks, from generic language tasks to a specialized summarization task to an even more specialized one such as bullet-based summarization. Using this approach, we demonstrate improvements of 1.05 ROUGE-L points on the Gigaword benchmark and 1.78 ROUGE-L points on the CNN/DailyMail benchmark, compared to a randomly initialized baseline.
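
The chaining idea in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the parameter dictionary, `make_stage`, and `run_stages` are hypothetical names, and each "training" step is a stand-in that just shifts the values. It assumes that full-network initialization means every parameter (encoder and decoder alike) is carried from one stage into the next.

```python
from typing import Callable, Dict, List

Params = Dict[str, float]
Stage = Callable[[Params], Params]


def make_stage(name: str, delta: float) -> Stage:
    """Build a hypothetical training stage; real training is replaced by a shift."""
    def train(params: Params) -> Params:
        # Stand-in for an actual optimization loop on the task called `name`.
        return {key: value + delta for key, value in params.items()}
    return train


def run_stages(initial: Params, stages: List[Stage]) -> Params:
    """Run stages in order, initializing each one from the previous stage's output."""
    params = dict(initial)
    for train in stages:
        # Full-network initialization (as assumed here): all parameters carry over.
        params = train(dict(params))
    return params


# Stage order follows the abstract: generic -> summarization -> bullet-based.
stages = [
    make_stage("generic language task", 1.0),
    make_stage("summarization", 0.5),
    make_stage("bullet-based summarization", 0.25),
]
final_params = run_stages({"encoder": 0.0, "decoder": 0.0}, stages)
```

Because each stage starts from the previous stage's output, the last stage benefits transitively from every earlier one.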

Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task

Dec 22, 2016

Nan Ding, Sebastian Goodman, Fei Sha, Radu Soricut

Abstract: We introduce a new multi-modal task for computer systems, posed as a combined vision-language comprehension challenge: identifying the most suitable text describing a scene, given several similar options. Accomplishing the task entails demonstrating comprehension beyond just recognizing "keywords" (or key-phrases) and their corresponding visual concepts. Instead, it requires an alignment between the representations of the two modalities that achieves a visually-grounded "understanding" of various linguistic elements and their dependencies. This new task also admits an easy-to-compute and well-studied metric: the accuracy in detecting the true target among the decoys. The paper makes several contributions: an effective and extensible mechanism for generating decoys from (human-created) image captions; an instance of applying this mechanism, yielding a large-scale machine comprehension dataset (based on the COCO images and captions) that we make publicly available; human evaluation results on this dataset, informing a performance upper-bound; and several baseline and competitive learning approaches that illustrate the utility of the proposed task and dataset in advancing both image and language comprehension. We also show that, in a multi-task learning setting, the performance on the proposed task is positively correlated with the end-to-end task of image captioning.
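
The metric named in this abstract, accuracy in detecting the true target among the decoys, can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the function name `decoy_accuracy` and the input layout (one list of model scores per image, plus the index of the true caption) are assumptions.

```python
from typing import Sequence


def decoy_accuracy(scores: Sequence[Sequence[float]],
                   target_index: Sequence[int]) -> float:
    """Fraction of examples whose highest-scoring candidate is the true caption."""
    if len(scores) != len(target_index):
        raise ValueError("scores and target_index must have the same length")
    if not scores:
        raise ValueError("at least one example is required")
    correct = 0
    for candidate_scores, target in zip(scores, target_index):
        # Pick the candidate (true caption or decoy) the model scores highest.
        predicted = max(range(len(candidate_scores)),
                        key=candidate_scores.__getitem__)
        correct += int(predicted == target)
    return correct / len(scores)


# Three images, two candidate captions each; the model is right on the first two.
example_accuracy = decoy_accuracy(
    [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]],
    [1, 0, 0],
)
```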

* 11 pages
