Topic:Abstractive Text Summarization
What is Abstractive Text Summarization? Abstractive text summarization is the process of generating a concise summary of a text by paraphrasing and rephrasing the content.
Papers and Code
Jul 16, 2024
Abstract:Despite the advances in the abstractive summarization task using Large Language Models (LLM), there is a lack of research that asses their abilities to easily adapt to different domains. We evaluate the domain adaptation abilities of a wide range of LLMs on the summarization task across various domains in both fine-tuning and in-context learning settings. We also present AdaptEval, the first domain adaptation evaluation suite. AdaptEval includes a domain benchmark and a set of metrics to facilitate the analysis of domain adaptation. Our results demonstrate that LLMs exhibit comparable performance in the in-context learning setting, regardless of their parameter scale.
Via

Sep 18, 2024
Abstract:Keyphrase selection plays a pivotal role within the domain of scholarly texts, facilitating efficient information retrieval, summarization, and indexing. In this work, we explored how to apply fine-tuned generative transformer-based models to the specific task of keyphrase selection within Russian scientific texts. We experimented with four distinct generative models, such as ruT5, ruGPT, mT5, and mBART, and evaluated their performance in both in-domain and cross-domain settings. The experiments were conducted on the texts of Russian scientific abstracts from four domains: mathematics & computer science, history, medicine, and linguistics. The use of generative models, namely mBART, led to gains in in-domain performance (up to 4.9% in BERTScore, 9.0% in ROUGE-1, and 12.2% in F1-score) over three keyphrase extraction baselines for the Russian language. Although the results for cross-domain usage were significantly lower, they still demonstrated the capability to surpass baseline performances in several cases, underscoring the promising potential for further exploration and refinement in this research field.
* DAMDID-2024
Via

Jul 30, 2024
Abstract:Currently, large language models are gaining popularity, their achievements are used in many areas, ranging from text translation to generating answers to queries. However, the main problem with these new machine learning algorithms is that training such models requires large computing resources that only large IT companies have. To avoid this problem, a number of methods (LoRA, quantization) have been proposed so that existing models can be effectively fine-tuned for specific tasks. In this paper, we propose an E2E (end to end) audio summarization model using these techniques. In addition, this paper examines the effectiveness of these approaches to the problem under consideration and draws conclusions about the applicability of these methods.
* 36 pages, Master's thesis, 14 figures
Via

Aug 27, 2024
Abstract:The proliferation of video content on platforms like YouTube and Vimeo presents significant challenges in efficiently locating relevant information. Automatic video summarization aims to address this by extracting and presenting key content in a condensed form. This thesis explores enhancing video summarization by integrating text-based queries and conditional modeling to tailor summaries to user needs. Traditional methods often produce fixed summaries that may not align with individual requirements. To overcome this, we propose a multi-modal deep learning approach that incorporates both textual queries and visual information, fusing them at different levels of the model architecture. Evaluation metrics such as accuracy and F1-score assess the quality of the generated summaries. The thesis also investigates improving text-based query representations using contextualized word embeddings and specialized attention networks. This enhances the semantic understanding of queries, leading to better video summaries. To emulate human-like summarization, which accounts for both visual coherence and abstract factors like storyline consistency, we introduce a conditional modeling approach. This method uses multiple random variables and joint distributions to capture key summarization components, resulting in more human-like and explainable summaries. Addressing data scarcity in fully supervised learning, the thesis proposes a segment-level pseudo-labeling approach. This self-supervised method generates additional data, improving model performance even with limited human-labeled datasets. In summary, this research aims to enhance automatic video summarization by incorporating text-based queries, improving query representations, introducing conditional modeling, and addressing data scarcity, thereby creating more effective and personalized video summaries.
* Ph.D. thesis, 137 pages
Via

Jul 02, 2024
Abstract:Abstractive Speech Summarization (SSum) aims to generate human-like text summaries from spoken content. It encounters difficulties in handling long speech input and capturing the intricate cross-modal mapping between long speech inputs and short text summaries. Research on large language models (LLMs) and multimodal information fusion has provided new insights for addressing these challenges. In this paper, we propose an end-to-end SSum model that utilizes Q-Former as a connector for the audio-text modality and employs LLMs to generate text summaries directly from speech features. We adopt a multi-stage training approach that includes LLM based ASR and Text Summarization (TSum) tasks as auxiliary tasks. ASR tasks are used to align feature spaces and enhance the LLM's ability to handle longer speech. Then, we utilize a curriculum learning strategy to facilitate the model's transition from TSum to SSum. Finally, our model achieves competitive performance on the How-2 dataset.
* InterSpeech 2024
Via

Mar 24, 2024
Abstract:As human society transitions into the information age, reduction in our attention span is a contingency, and people who spend time reading lengthy news articles are decreasing rapidly and the need for succinct information is higher than ever before. Therefore, it is essential to provide a quick overview of important news by concisely summarizing the top news article and the most intuitive headline. When humans try to make summaries, they extract the essential information from the source and add useful phrases and grammatical annotations from the original extract. Humans have a unique ability to create abstractions. However, automatic summarization is a complicated problem to solve. The use of sequence-to-sequence (seq2seq) models for neural abstractive text summarization has been ascending as far as prevalence. Numerous innovative strategies have been proposed to develop the current seq2seq models further, permitting them to handle different issues like saliency, familiarity, and human lucidness and create excellent synopses. In this article, we aimed toward enhancing the present architectures and models for abstractive text summarization. The modifications have been aimed at fine-tuning hyper-parameters, attempting specific encoder-decoder combinations. We examined many experiments on an extensively used CNN/DailyMail dataset to check the effectiveness of various models.
Via

Jun 08, 2024
Abstract:Nowadays, neural text generation has made tremendous progress in abstractive summarization tasks. However, most of the existing summarization models take in the whole document all at once, which sometimes cannot meet the needs in practice. Practically, social text streams such as news events and tweets keep growing from time to time, and can only be fed to the summarization system step by step. Hence, in this paper, we propose the task of Stepwise Summarization, which aims to generate a new appended summary each time a new document is proposed. The appended summary should not only summarize the newly added content but also be coherent with the previous summary, to form an up-to-date complete summary. To tackle this challenge, we design an adversarial learning model, named Stepwise Summary Generator (SSG). First, SSG selectively processes the new document under the guidance of the previous summary, obtaining polished document representation. Next, SSG generates the summary considering both the previous summary and the document. Finally, a convolutional-based discriminator is employed to determine whether the newly generated summary is coherent with the previous summary. For the experiment, we extend the traditional two-step update summarization setting to a multi-step stepwise setting, and re-propose a large-scale stepwise summarization dataset based on a public story generation dataset. Extensive experiments on this dataset show that SSG achieves state-of-the-art performance in terms of both automatic metrics and human evaluations. Ablation studies demonstrate the effectiveness of each module in our framework. We also discuss the benefits and limitations of recent large language models on this task.
* 10 pages, 4 figures, published in TASLP
Via

Jul 04, 2024
Abstract:Nursing notes, an important component of Electronic Health Records (EHRs), keep track of the progression of a patient's health status during a care episode. Distilling the key information in nursing notes through text summarization techniques can improve clinicians' efficiency in understanding patients' conditions when reviewing nursing notes. However, existing abstractive summarization methods in the clinical setting have often overlooked nursing notes and require the creation of reference summaries for supervision signals, which is time-consuming. In this work, we introduce QGSumm, a query-guided self-supervised domain adaptation framework for nursing note summarization. Using patient-related clinical queries as guidance, our approach generates high-quality, patient-centered summaries without relying on reference summaries for training. Through automatic and manual evaluation by an expert clinician, we demonstrate the strengths of our approach compared to the state-of-the-art Large Language Models (LLMs) in both zero-shot and few-shot settings. Ultimately, our approach provides a new perspective on conditional text summarization, tailored to the specific interests of clinical personnel.
Via

Aug 29, 2024
Abstract:Reinforcement learning is used to align language models with human preference signals after first pre-training the model to predict the next token of text within a large corpus using likelihood maximization. Before being deployed in a specific domain, models are often further fine-tuned on task specific data. Since human preferences are often unavailable for the last step, it is performed using likelihood maximization as that is the typical default method. However, reinforcement learning has other advantages besides facilitating alignment to a human derived reward function. For one, whereas likelihood maximization is a form of imitation learning in which the model is trained on what to do under ideal conditions, reinforcement learning is not limited to demonstrating actions just for optimally reached states and trains a model what to do under a range of scenarios as it explores the policy space. In addition, it also trains a model what not to do, suppressing competitive but poor actions. This work develops a framework for last-mile fine-tuning using reinforcement learning and tests whether it garners performance gains. The experiments center on abstractive summarization, but the framework is general and broadly applicable. Use of the procedure produced significantly better results than likelihood maximization when comparing raw predictions. For the specific data tested, the gap could be bridged by employing post-processing of the maximum likelihood outputs. Nonetheless, the framework offers a new avenue for model optimization in situations where post-processing may be less straightforward or effective, and it can be extended to include more complex classes of undesirable outputs to penalize and train against, such as hallucinations.
Via

Apr 08, 2024
Abstract:Automatic text summarization (TS) plays a pivotal role in condensing large volumes of information into concise, coherent summaries, facilitating efficient information retrieval and comprehension. This paper presents a novel framework for abstractive TS of single documents, which integrates three dominant aspects: structural, semantic, and neural-based approaches. The proposed framework merges machine learning and knowledge-based techniques to achieve a unified methodology. The framework consists of three main phases: pre-processing, machine learning, and post-processing. In the pre-processing phase, a knowledge-based Word Sense Disambiguation (WSD) technique is employed to generalize ambiguous words, enhancing content generalization. Semantic content generalization is then performed to address out-of-vocabulary (OOV) or rare words, ensuring comprehensive coverage of the input document. Subsequently, the generalized text is transformed into a continuous vector space using neural language processing techniques. A deep sequence-to-sequence (seq2seq) model with an attention mechanism is employed to predict a generalized summary based on the vector representation. In the post-processing phase, heuristic algorithms and text similarity metrics are utilized to refine the generated summary further. Concepts from the generalized summary are matched with specific entities, enhancing coherence and readability. Experimental evaluations conducted on prominent datasets, including Gigaword, Duc 2004, and CNN/DailyMail, demonstrate the effectiveness of the proposed framework. Results indicate significant improvements in handling rare and OOV words, outperforming existing state-of-the-art deep learning techniques. The proposed framework presents a comprehensive and unified approach towards abstractive TS, combining the strengths of structure, semantics, and neural-based methodologies.
Via
