Martin Potthast

If there's a Trigger Warning, then where's the Trigger? Investigating Trigger Warnings at the Passage Level

Apr 15, 2024

Matti Wiegmann, Jennifer Rakete, Magdalena Wolska, Benno Stein, Martin Potthast

Abstract: Trigger warnings are labels that preface documents with sensitive content if this content could be perceived as harmful by certain groups of readers. Since warnings about a document intuitively need to be shown before reading it, authors usually assign trigger warnings at the document level. What parts of their writing prompted them to assign a warning, however, remains unclear. We investigate for the first time the feasibility of identifying the triggering passages of a document, both manually and computationally. We create a dataset of 4,135 English passages, each annotated with one of eight common trigger warnings. In a large-scale evaluation, we then systematically evaluate the effectiveness of fine-tuned and few-shot classifiers, and their generalizability. We find that trigger annotation belongs to the group of subjective annotation tasks in NLP, and that automatic trigger classification remains challenging but feasible.

Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders

Apr 11, 2024

Ferdinand Schlatt, Maik Fröbe, Harrisen Scells, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Benno Stein, Martin Potthast, Matthias Hagen

Abstract: Cross-encoders are effective passage re-rankers. But when re-ranking multiple passages at once, existing cross-encoders inefficiently optimize the output ranking over several input permutations, as their passage interactions are not permutation-invariant. Moreover, their high memory footprint constrains the number of passages during listwise training. To tackle these issues, we propose the Set-Encoder, a new cross-encoder architecture that (1) introduces inter-passage attention with parallel passage processing to ensure permutation invariance between input passages, and that (2) uses fused-attention kernels to enable training with more passages at a time. In experiments on TREC Deep Learning and TIREx, the Set-Encoder is more effective than previous cross-encoders with a similar number of parameters. Compared to larger models, the Set-Encoder is more efficient and either on par or even more effective.
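
The permutation-invariance property mentioned in the abstract can be illustrated with a toy sketch. This is not the Set-Encoder architecture: `toy_embed` and `score_passages` are hypothetical names, and the integer "embedding" is an assumption chosen only so that results are exact. Each passage is encoded independently, and passages interact only through a symmetric aggregate (an element-wise sum), so a passage's score cannot depend on the input order.

```python
from itertools import permutations


def toy_embed(passage):
    """Hypothetical stand-in for a passage encoder: (word count, character count)."""
    words = passage.split()
    return (len(words), sum(len(w) for w in words))


def score_passages(passages):
    """Score each passage against an order-independent summary of all passages."""
    vectors = [toy_embed(p) for p in passages]
    # Element-wise sum over all passage vectors: a symmetric function,
    # so reordering the passages leaves the shared context unchanged.
    context = tuple(sum(column) for column in zip(*vectors))
    return {
        p: sum(v * c for v, c in zip(vec, context))
        for p, vec in zip(passages, vectors)
    }


passages = ["a b", "cc dd ee", "f"]
baseline = score_passages(passages)

# Every ordering of the input yields the same score for each passage.
all_same = all(
    score_passages(list(order)) == baseline for order in permutations(passages)
)
```

A model that instead concatenated the passages into one sequence would generally produce order-dependent scores, which is the inefficiency the abstract attributes to existing listwise cross-encoders.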

Task-Oriented Paraphrase Analytics

Mar 26, 2024

Marcel Gohsen, Matthias Hagen, Martin Potthast, Benno Stein

Abstract: Since paraphrasing is an ill-defined task, the term "paraphrasing" covers text transformation tasks with different characteristics. Consequently, existing paraphrasing studies have applied quite different (explicit and implicit) criteria as to when a pair of texts is to be considered a paraphrase, all of which amount to postulating a certain level of semantic or lexical similarity. In this paper, we conduct a literature review and propose a taxonomy to organize the 25 identified paraphrasing (sub-)tasks. Using classifiers trained to identify the tasks that a given paraphrasing instance fits, we find that the distributions of task-specific instances in the known paraphrase corpora vary substantially. This means that the use of these corpora, without the respective paraphrase conditions being clearly defined (which is the normal case), must lead to incomparable and misleading results.

* Accepted at LREC-COLING 2024

Analyzing Adversarial Attacks on Sequence-to-Sequence Relevance Models

Mar 12, 2024

Andrew Parry, Maik Fröbe, Sean MacAvaney, Martin Potthast, Matthias Hagen

Abstract: Modern sequence-to-sequence relevance models like monoT5 can effectively capture complex textual interactions between queries and documents through cross-encoding. However, the use of natural language tokens in prompts, such as "Query", "Document", and "Relevant" for monoT5, opens an attack vector for malicious documents to manipulate their relevance score through prompt injection, e.g., by adding target words such as "true". Since such possibilities have not yet been considered in retrieval evaluation, we analyze the impact of query-independent prompt injection via manually constructed templates and LLM-based rewriting of documents on several existing relevance models. Our experiments on the TREC Deep Learning track show that adversarial documents can easily manipulate different sequence-to-sequence relevance models, while BM25 (as a typical lexical model) is not affected. Remarkably, the attacks also affect encoder-only relevance models (which do not rely on natural language prompt tokens), albeit to a lesser extent.

* 13 pages, 3 figures, Accepted at ECIR 2024 as a Full Paper

TL;DR Progress: Multi-faceted Literature Exploration in Text Summarization

Feb 10, 2024

Shahbaz Syed, Khalid Al-Khatib, Martin Potthast

Abstract: This paper presents TL;DR Progress, a new tool for exploring the literature on neural text summarization. It organizes 514 papers based on a comprehensive annotation scheme for text summarization approaches and enables fine-grained, faceted search. Each paper was manually annotated to capture aspects such as evaluation metrics, quality dimensions, learning paradigms, challenges addressed, datasets, and document domains. In addition, a succinct indicative summary is provided for each paper, consisting of automatically extracted contextual factors, issues, and proposed solutions. The tool is available online at https://www.tldr-progress.de and a demo video at https://youtu.be/uCVRGFvXUj8

* EACL 2024 System Demonstration

Detecting Generated Native Ads in Conversational Search

Feb 07, 2024

Sebastian Schmidt, Ines Zelch, Janek Bevendorff, Benno Stein, Matthias Hagen, Martin Potthast

Abstract: Conversational search engines such as YouChat and Microsoft Copilot use large language models (LLMs) to generate answers to queries. It is only a small step to also use this technology to generate and integrate advertising within these answers, instead of placing ads separately from the organic search results. This type of advertising is reminiscent of native advertising and product placement, both of which are very effective forms of subtle and manipulative advertising. It is likely that information seekers will be confronted with such use of LLM technology in the near future, especially when considering the high computational costs associated with LLMs, for which providers need to develop sustainable business models. This paper investigates whether LLMs can also be used as a countermeasure against generated native ads, i.e., to block them. For this purpose we compile a large dataset of ad-prone queries and of generated answers with automatically integrated ads to experiment with fine-tuned sentence transformers and state-of-the-art LLMs on the task of recognizing the ads. In our experiments sentence transformers achieve detection precision and recall values above 0.9, while the investigated LLMs struggle with the task.

* Submitted to WWW'24 Short Papers Track; 4 pages

Zero-shot Generative Large Language Models for Systematic Review Screening Automation

Feb 01, 2024

Shuai Wang, Harrisen Scells, Shengyao Zhuang, Martin Potthast, Bevan Koopman, Guido Zuccon

Abstract: Systematic reviews are crucial for evidence-based medicine as they comprehensively analyse published research findings on specific questions. Conducting such reviews is often resource- and time-intensive, especially in the screening phase, where abstracts of publications are assessed for inclusion in a review. This study investigates the effectiveness of using zero-shot large language models (LLMs) for automatic screening. We evaluate the effectiveness of eight different LLMs and investigate a calibration technique that uses a predefined recall threshold to determine whether a publication should be included in a systematic review. Our comprehensive evaluation using five standard test collections shows that instruction fine-tuning plays an important role in screening, that calibration renders LLMs practical for achieving a targeted recall, and that combining both with an ensemble of zero-shot models saves significant screening time compared to state-of-the-art approaches.

* Accepted to ECIR2024 full paper (findings)
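
One way to picture calibrating a score cutoff toward a target recall is sketched below. It is a minimal illustration, not the paper's procedure: it assumes a small calibration set of model scores for publications already known to be relevant, and the names `calibrate_threshold` and `screen` are hypothetical.

```python
import math


def calibrate_threshold(relevant_scores, target_recall):
    """Return the highest cutoff that keeps at least `target_recall`
    of the known-relevant calibration scores at or above it."""
    if not relevant_scores:
        raise ValueError("need at least one relevant calibration score")
    if not 0 < target_recall <= 1:
        raise ValueError("target_recall must be in (0, 1]")
    ranked = sorted(relevant_scores, reverse=True)
    # Number of relevant items that must pass the cutoff.
    needed = math.ceil(target_recall * len(ranked))
    return ranked[needed - 1]


def screen(scores, threshold):
    """Include a publication when its score reaches the calibrated cutoff."""
    return [score >= threshold for score in scores]


# Assumed calibration scores for four publications known to be relevant.
threshold = calibrate_threshold([0.9, 0.6, 0.3, 0.8], target_recall=0.75)
decisions = screen([0.95, 0.6, 0.5, 0.1], threshold)
```

Here three of the four calibration scores must pass, so the cutoff lands on the third-highest score. Raising `target_recall` lowers the cutoff, so more publications are included and fewer are saved from manual screening.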

Citance-Contextualized Summarization of Scientific Papers

Nov 13, 2023

Shahbaz Syed, Ahmad Dawar Hakimi, Khalid Al-Khatib, Martin Potthast

Abstract: Current approaches to automatic summarization of scientific papers generate informative summaries in the form of abstracts. However, abstracts are not intended to show the relationship between a paper and the references cited in it. We propose a new contextualized summarization approach that can generate an informative summary conditioned on a given sentence containing the citation of a reference (a so-called "citance"). This summary outlines the content of the cited paper relevant to the citation location. Thus, our approach extracts and models the citances of a paper, retrieves relevant passages from cited papers, and generates abstractive summaries tailored to each citance. We evaluate our approach using Webis-Context-SciSumm-2023, a new dataset containing 540K computer science papers and 4.6M citances therein.

* Accepted at EMNLP 2023 Findings

Evaluating Generative Ad Hoc Information Retrieval

Nov 08, 2023

Lukas Gienapp, Harrisen Scells, Niklas Deckers, Janek Bevendorff, Shuai Wang, Johannes Kiesel, Shahbaz Syed, Maik Fröbe, Guido Zuccon, Benno Stein (+2 more)

Abstract: Recent advances in large language models have enabled the development of viable generative information retrieval systems. A generative retrieval system returns a grounded generated text in response to an information need instead of the traditional document ranking. Quantifying the utility of these types of responses is essential for evaluating generative retrieval systems. As the established evaluation methodology for ranking-based ad hoc retrieval may seem unsuitable for generative retrieval, new approaches for reliable, repeatable, and reproducible experimentation are required. In this paper, we survey the relevant information retrieval and natural language processing literature, identify search tasks and system architectures in generative retrieval, develop a corresponding user model, and study its operationalization. This theoretical analysis provides a foundation and new insights for the evaluation of generative ad hoc retrieval systems.

* 14 pages, 5 figures, 1 table

Indicative Summarization of Long Discussions

Nov 03, 2023

Shahbaz Syed, Dominik Schwabe, Khalid Al-Khatib, Martin Potthast

Abstract: Online forums encourage the exchange and discussion of different stances on many topics. Not only do they provide an opportunity to present one's own arguments, but they may also gather a broad cross-section of others' arguments. However, the resulting long discussions are difficult to survey. This paper presents a novel unsupervised approach using large language models (LLMs) for generating indicative summaries for long discussions that basically serve as tables of contents. Our approach first clusters argument sentences, generates cluster labels as abstractive summaries, and classifies the generated cluster labels into argumentation frames, resulting in a two-level summary. Based on an extensively optimized prompt engineering approach, we evaluate 19 LLMs for generative cluster labeling and frame classification. To evaluate the usefulness of our indicative summaries, we conduct a purpose-driven user study via a new visual interface called Discussion Explorer: It shows that our proposed indicative summaries serve as a convenient navigation tool to explore long discussions.

* Accepted at EMNLP 2023 Main Conference
