Katrina Olsen

Finding Sense in Nonsense with Generated Contexts: Perspectives from Humans and Language Models

Feb 13, 2026

Katrina Olsen, Sebastian Padó

Abstract: Nonsensical and anomalous sentences have been instrumental in the development of computational models of semantic interpretation. A core challenge is to distinguish between what is merely anomalous (but can be interpreted given a supporting context) and what is truly nonsensical. However, it is unclear (a) how nonsensical, rather than merely anomalous, existing datasets are; and (b) how well LLMs can make this distinction. In this paper, we answer both questions by collecting sensicality judgments from human raters and LLMs on sentences from five semantically deviant datasets: both context-free and when providing a context. We find that raters consider most sentences at most anomalous, and only a few as properly nonsensical. We also show that LLMs are substantially skilled in generating plausible contexts for anomalous cases.

Gender, names and other mysteries: Towards the ambiguous for gender-inclusive translation

Jun 07, 2023

Danielle Saunders, Katrina Olsen

Abstract: The vast majority of work on gender in MT focuses on 'unambiguous' inputs, where gender markers in the source language are expected to be resolved in the output. Conversely, this paper explores the widespread case where the source sentence lacks explicit gender markers, but the target sentence contains them due to richer grammatical gender. We particularly focus on inputs containing person names. Investigating such sentence pairs casts a new light on research into MT gender bias and its mitigation. We find that many name-gender co-occurrences in MT data are not resolvable with 'unambiguous gender' in the source language, and that gender-ambiguous examples can make up a large proportion of training examples. From this, we discuss potential steps toward gender-inclusive translation which accepts the ambiguity in both gender and translation.

* GITT workshop at EAMT 2023

AbLit: A Resource for Analyzing and Generating Abridged Versions of English Literature

Feb 13, 2023

Melissa Roemmele, Kyle Shaffer, Katrina Olsen, Yiyi Wang, Steve DeNeefe

Abstract: Creating an abridged version of a text involves shortening it while maintaining its linguistic qualities. In this paper, we examine this task from an NLP perspective for the first time. We present a new resource, AbLit, which is derived from abridged versions of English literature books. The dataset captures passage-level alignments between the original and abridged texts. We characterize the linguistic relations of these alignments, and create automated models to predict these relations as well as to generate abridgements for new texts. Our findings establish abridgement as a challenging task, motivating future resources and research. The dataset is available at github.com/roemmele/AbLit.

* Accepted at EACL 2023
