Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Thorben Schomacker

Exploring Automatic Text Simplification of German Narrative Documents

Dec 15, 2023

Thorben Schomacker, Tillmann Dönicke, Marina Tropmann-Frick

Abstract:In this paper, we apply transformer-based Natural Language Generation (NLG) techniques to the problem of text simplification. Currently, there are only a few German datasets available for text simplification, even fewer with larger and aligned documents, and not a single one with narrative texts. In this paper, we explore to which degree modern NLG techniques can be applied to German narrative text simplifications. We use Longformer attention and a pre-trained mBART model. Our findings indicate that the existing approaches for German are not able to solve the task properly. We conclude on a few directions for future research to address this problem.

Via

Access Paper or Ask Questions

Data and Approaches for German Text simplification -- towards an Accessibility-enhanced Communication

Dec 15, 2023

Thorben Schomacker, Michael Gille, Jörg von der Hülls, Marina Tropmann-Frick

Figure 1 for Data and Approaches for German Text simplification -- towards an Accessibility-enhanced Communication

Abstract:This paper examines the current state-of-the-art of German text simplification, focusing on parallel and monolingual German corpora. It reviews neural language models for simplifying German texts and assesses their suitability for legal texts and accessibility requirements. Our findings highlight the need for additional training data and more appropriate approaches that consider the specific linguistic characteristics of German, as well as the importance of the needs and preferences of target groups with cognitive or language impairments. The authors launched the interdisciplinary OPEN-LS project in April 2023 to address these research gaps. The project aims to develop a framework for text formats tailored to individuals with low literacy levels, integrate legal texts, and enhance comprehensibility for those with linguistic or cognitive impairments. It will also explore cost-effective ways to enhance the data with audience-specific illustrations using image-generating AI. For more and up-to-date information, please visit our project homepage https://open-ls.entavis.com

Via

Access Paper or Ask Questions