The related work section is an important component of a scientific paper, which highlights the contribution of the target paper in the context of the reference papers. Authors can save their time and effort by using the automatically generated related work section as a draft to complete the final related work. Most of the existing related work section generation methods rely on extracting off-the-shelf sentences to make a comparative discussion about the target work and the reference papers. However, such sentences need to be written in advance and are hard to obtain in practice. Hence, in this paper, we propose an abstractive target-aware related work generator (TAG), which can generate related work sections consisting of new sentences. Concretely, we first propose a target-aware graph encoder, which models the relationships between reference papers and the target paper with target-centered attention mechanisms. In the decoding process, we propose a hierarchical decoder that attends to the nodes of different levels in the graph with keyphrases as semantic indicators. Finally, to generate a more informative related work, we propose multi-level contrastive optimization objectives, which aim to maximize the mutual information between the generated related work with the references and minimize that with non-references. Extensive experiments on two public scholar datasets show that the proposed model brings substantial improvements over several strong baselines in terms of automatic and tailored human evaluations.
Merge trees are a type of topological descriptors that record the connectivity among the sublevel sets of scalar fields. In this paper, we are interested in sketching a set of merge trees. That is, given a set T of merge trees, we would like to find a basis set S such that each tree in T can be approximately reconstructed from a linear combination of merge trees in S. A set of high-dimensional vectors can be sketched via matrix sketching techniques such as principal component analysis and column subset selection. However, up until now, topological descriptors such as merge trees have not been known to be sketchable. We develop a framework for sketching a set of merge trees that combines the Gromov-Wasserstein framework of Chowdhury and Needham with techniques from matrix sketching. We demonstrate the applications of our framework in sketching merge trees that arise from data ensembles in scientific simulations.
Eye-catching headlines function as the first device to trigger more clicks, bringing reciprocal effect between producers and viewers. Producers can obtain more traffic and profits, and readers can have access to outstanding articles. When generating attractive headlines, it is important to not only capture the attractive content but also follow an eye-catching written style. In this paper, we propose a Disentanglement-based Attractive Headline Generator (DAHG) that generates headline which captures the attractive content following the attractive style. Concretely, we first devise a disentanglement module to divide the style and content of an attractive prototype headline into latent spaces, with two auxiliary constraints to ensure the two spaces are indeed disentangled. The latent content information is then used to further polish the document representation and help capture the salient part. Finally, the generator takes the polished document as input to generate headline under the guidance of the attractive style. Extensive experiments on the public Kuaibao dataset show that DAHG achieves state-of-the-art performance. Human evaluation also demonstrates that DAHG triggers 22% more clicks than existing models.
A popular multimedia news format nowadays is providing users with a lively video and a corresponding news article, which is employed by influential news media including CNN, BBC, and social media including Twitter and Weibo. In such a case, automatically choosing a proper cover frame of the video and generating an appropriate textual summary of the article can help editors save time, and readers make the decision more effectively. Hence, in this paper, we propose the task of Video-based Multimodal Summarization with Multimodal Output (VMSMO) to tackle such a problem. The main challenge in this task is to jointly model the temporal dependency of video with semantic meaning of article. To this end, we propose a Dual-Interaction-based Multimodal Summarizer (DIMS), consisting of a dual interaction module and multimodal generator. In the dual interaction module, we propose a conditional self-attention mechanism that captures local semantic information within video and a global-attention mechanism that handles the semantic relationship between news text and video from a high level. Extensive experiments conducted on a large-scale real-world VMSMO dataset show that DIMS achieves the state-of-the-art performance in terms of both automatic metrics and human evaluations.