Simone Paolo Ponzetto

DS-TOD: Efficient Domain Specialization for Task Oriented Dialog

Oct 15, 2021

Chia-Chien Hung, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš

Abstract:Recent work has shown that self-supervised dialog-specific pretraining on large conversational datasets yields substantial gains over traditional language modeling (LM) pretraining in downstream task-oriented dialog (TOD). These approaches, however, exploit general dialogic corpora (e.g., Reddit) and thus presumably fail to reliably embed domain-specific knowledge useful for concrete downstream TOD domains. In this work, we investigate the effects of domain specialization of pretrained language models (PLMs) for task-oriented dialog. Within our DS-TOD framework, we first automatically extract salient domain-specific terms, and then use them to construct DomainCC and DomainReddit -- resources that we leverage for domain-specific pretraining, based on (i) masked language modeling (MLM) and (ii) response selection (RS) objectives, respectively. We further propose a resource-efficient and modular domain specialization by means of domain adapters -- additional parameter-light layers in which we encode the domain knowledge. Our experiments with two prominent TOD tasks -- dialog state tracking (DST) and response retrieval (RR) -- encompassing five domains from the MultiWOZ TOD benchmark demonstrate the effectiveness of our domain specialization approach. Moreover, we show that the light-weight adapter-based specialization (1) performs comparably to full fine-tuning in single-domain setups and (2) is particularly suitable for multi-domain specialization, in which, besides advantageous computational footprint, it can offer better downstream performance.
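The abstract describes domain adapters only as additional parameter-light layers and does not give their architecture. As a rough illustration, the sketch below assumes a common bottleneck design (down-projection, nonlinearity, up-projection, residual connection); the class name `BottleneckAdapter`, the sizes, and the zero initialization are hypothetical and not taken from the paper.

```python
import random


def relu(v):
    # Element-wise ReLU on a plain Python list.
    return [max(0.0, x) for x in v]


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class BottleneckAdapter:
    """Hypothetical bottleneck adapter (an assumption, not the paper's exact layer)."""

    def __init__(self, hidden_size, bottleneck_size, seed=0):
        rng = random.Random(seed)
        # Down-projection: hidden_size -> bottleneck_size, small random weights.
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
                     for _ in range(bottleneck_size)]
        # Up-projection: bottleneck_size -> hidden_size, zero-initialized so the
        # adapter starts out as the identity function.
        self.up = [[0.0] * bottleneck_size for _ in range(hidden_size)]

    def __call__(self, h):
        z = relu(matvec(self.down, h))
        delta = matvec(self.up, z)
        # Residual connection: the frozen model's hidden state plus the adapter update.
        return [x + d for x, d in zip(h, delta)]

    def num_parameters(self):
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)


adapter = BottleneckAdapter(hidden_size=8, bottleneck_size=2)
hidden_state = [1.0] * 8
output = adapter(hidden_state)
```

Under these assumptions only the two small projection matrices would be trained per domain, which is what makes this kind of specialization modular: one adapter per domain can be stored and swapped without touching the pretrained weights.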

Diachronic Analysis of German Parliamentary Proceedings: Ideological Shifts through the Lens of Political Biases

Aug 13, 2021

Tobias Walter, Celina Kirschner, Steffen Eger, Goran Glavaš, Anne Lauscher, Simone Paolo Ponzetto

Abstract:We analyze bias in historical corpora as encoded in diachronic distributional semantic models by focusing on two specific forms of bias, namely a political one (anti-communism) and a racist one (antisemitism). For this, we use a new corpus of German parliamentary proceedings, DeuPARL, spanning the period 1867--2020. We complement this analysis of historical biases in diachronic word embeddings with a novel measure of bias on the basis of term co-occurrences and graph-based label propagation. The results of our bias measurements align with commonly perceived historical trends of antisemitic and anti-communist biases in German politics in different time periods, thus indicating the viability of analyzing historical bias trends using semantic spaces induced from historical corpora.

* Accepted for JCDL2021

Large-scale Taxonomy Induction Using Entity and Word Embeddings

May 04, 2021

Petar Ristoski, Stefano Faralli, Simone Paolo Ponzetto, Heiko Paulheim

Abstract:Taxonomies are an important ingredient of knowledge organization, and serve as a backbone for more sophisticated knowledge representations in intelligent systems, such as formal ontologies. However, building taxonomies manually is a costly endeavor, and hence, automatic methods for taxonomy induction are a good alternative for building large-scale taxonomies. In this paper, we propose TIEmb, an approach for automatic unsupervised class subsumption axiom extraction from knowledge bases using entity and text embeddings. We apply the approach to the WebIsA database, a database of subsumption relations extracted from a large portion of the World Wide Web, to extract class hierarchies in the Person and Place domains.

* Published at IEEE/WIC/ACM International Conference on Web Intelligence 2017 (WI'17)

DebIE: A Platform for Implicit and Explicit Debiasing of Word Embedding Spaces

Mar 11, 2021

Niklas Friedrich, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš

Abstract:Recent research efforts in NLP have demonstrated that distributional word vector spaces often encode stereotypical human biases, such as racism and sexism. With word representations ubiquitously used in NLP models and pipelines, this raises ethical issues and jeopardizes the fairness of language technologies. While there exists a large body of work on bias measures and debiasing methods, to date, there is no platform that would unify these research efforts and make bias measuring and debiasing of representation spaces widely accessible. In this work, we present DebIE, the first integrated platform for (1) measuring and (2) mitigating bias in word embeddings. Given (i) an embedding space (users can choose between the predefined spaces or upload their own) and (ii) a bias specification (users can choose between existing bias specifications or create their own), DebIE can (1) compute several measures of implicit and explicit bias and (2) modify the embedding space by executing two (mutually composable) debiasing models. DebIE's functionality can be accessed through four different interfaces: (a) a web application, (b) a desktop application, (c) a REST-ful API, and (d) a command-line application. DebIE is available at: debie.informatik.uni-mannheim.de.

* Accepted as EACL21 Demo

FakeFlow: Fake News Detection by Modeling the Flow of Affective Information

Jan 24, 2021

Bilal Ghanem, Simone Paolo Ponzetto, Paolo Rosso, Francisco Rangel

Abstract:Fake news articles often stir the readers' attention by means of emotional appeals that arouse their feelings. Unlike in short news texts, authors of longer articles can exploit such affective factors to manipulate readers by adding exaggerations or fabricating events, in order to affect the readers' emotions. To capture this, we propose in this paper to model the flow of affective information in fake news articles using a neural architecture. The proposed model, FakeFlow, learns this flow by combining topic and affective information extracted from text. We evaluate the model's performance with several experiments on four real-world datasets. The results show that FakeFlow achieves superior results when compared against state-of-the-art methods, thus confirming the importance of capturing the flow of the affective information in news articles.

* 9 pages, 6 figures, EACL-2021

Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval

Jan 21, 2021

Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš

Abstract:Pretrained multilingual text encoders based on neural Transformer architectures, such as multilingual BERT (mBERT) and XLM, have achieved strong performance on a myriad of language understanding tasks. Consequently, they have been adopted as a go-to paradigm for multilingual and cross-lingual representation learning and transfer, rendering cross-lingual word embeddings (CLWEs) effectively obsolete. However, questions remain as to what extent this finding generalizes (1) to unsupervised settings and (2) to ad-hoc cross-lingual IR (CLIR) tasks. Therefore, in this work we present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a large number of language pairs. In contrast to supervised language understanding, our results indicate that for unsupervised document-level CLIR -- a setup with no relevance judgments for IR-specific fine-tuning -- pretrained encoders fail to significantly outperform models based on CLWEs. For sentence-level CLIR, we demonstrate that state-of-the-art performance can be achieved. However, the peak performance is not reached by using the general-purpose multilingual text encoders 'off-the-shelf', but rather by relying on their variants that have been further specialized for sentence understanding tasks.
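To make the unsupervised setup concrete: with no relevance judgments available, retrieval can be reduced to ranking documents by how similar their vectors are to the query vector in a shared cross-lingual space. The sketch below assumes cosine similarity and hand-made toy vectors; the function name `rank_documents` and the document ids are hypothetical, and the abstract does not state which similarity function or aggregation the study uses.

```python
import math


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted by descending cosine similarity to the query.

    doc_vecs maps a document id to its vector. In a real system both vectors
    would come from a multilingual encoder or from CLWEs; here they are toy
    values chosen by hand (an assumption for illustration only).
    """
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


# Toy example: a query in one language, documents in another, same vector space.
documents = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [0.7, 0.7]}
ranking = rank_documents([1.0, 0.1], documents)
```

Nothing in this pipeline is trained on retrieval data, so its quality depends entirely on how well the underlying representation space aligns the two languages.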

* accepted at ECIR'21 (preprint)

Self-Supervised Learning for Visual Summary Identification in Scientific Publications

Jan 14, 2021

Shintaro Yamamoto, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš, Shigeo Morishima

Abstract:Providing visual summaries of scientific publications can increase information access for readers and thereby help deal with the exponential growth in the number of scientific publications. Nonetheless, efforts in providing visual publication summaries have been few and far between, primarily focusing on the biomedical domain. This is mainly because of the limited availability of annotated gold standards, which hampers the application of robust and high-performing supervised learning techniques. To address these problems, we create a new benchmark dataset for selecting figures to serve as visual summaries of publications based on their abstracts, covering several domains in computer science. Moreover, we develop a self-supervised learning approach, based on heuristic matching of inline references to figures with figure captions. Experiments in both biomedical and computer science domains show that our model is able to outperform the state of the art despite being self-supervised and therefore not relying on any annotated training data.

AraWEAT: Multidimensional Analysis of Biases in Arabic Word Embeddings

Nov 03, 2020

Anne Lauscher, Rafik Takieddin, Simone Paolo Ponzetto, Goran Glavaš

Abstract:Recent work has shown that distributional word vector spaces often encode human biases like sexism or racism. In this work, we conduct an extensive analysis of biases in Arabic word embeddings by applying a range of recently introduced bias tests on a variety of embedding spaces induced from corpora in Arabic. We measure the presence of biases across several dimensions, namely: embedding models (Skip-Gram, CBOW, and FastText) and vector sizes, types of text (encyclopedic text, and news vs. user-generated content), dialects (Egyptian Arabic vs. Modern Standard Arabic), and time (diachronic analyses over corpora from different time periods). Our analysis yields several interesting findings, e.g., that implicit gender bias in embeddings trained on Arabic news corpora steadily increases over time (between 2007 and 2017). We make the Arabic bias specifications (AraWEAT) publicly available.
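The abstract does not list the individual bias tests it applies. As one plausible example, the sketch below computes the effect size of the Word Embedding Association Test (WEAT), which the resource name alludes to, on toy two-dimensional vectors. The vectors are invented for illustration, and the use of the sample standard deviation is an assumption, since implementations differ on that detail.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def association(w, A, B):
    # How much more strongly w associates with attribute set A than with B.
    return mean(cosine(w, a) for a in A) - mean(cosine(w, b) for b in B)


def weat_effect_size(X, Y, A, B):
    """WEAT effect size for target sets X, Y and attribute sets A, B.

    Difference of the mean associations of X and Y, normalized by the
    standard deviation over all targets (sample std here; an assumption).
    """
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    return (mean(sx) - mean(sy)) / stdev(sx + sy)


# Toy vectors (invented): X leans toward attribute A, Y toward attribute B.
X = [[1.0, 0.0], [0.9, 0.1]]
Y = [[0.0, 1.0], [0.1, 0.9]]
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]
effect = weat_effect_size(X, Y, A, B)
```

A positive value indicates that the first target set is more strongly associated with the first attribute set; swapping the two target sets flips the sign.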

* accepted for WANLP 20

Word Sense Disambiguation for 158 Languages using Word Embeddings Only

Mar 14, 2020

Varvara Logacheva, Denis Teslenko, Artem Shelmanov, Steffen Remus, Dmitry Ustalov, Andrey Kutuzov, Ekaterina Artemova, Chris Biemann, Simone Paolo Ponzetto, Alexander Panchenko

Abstract:Disambiguation of word senses in context is easy for humans, but is a major challenge for automatic approaches. Sophisticated supervised and knowledge-based models were developed to solve this task. However, (i) the inherent Zipfian distribution of supervised training instances for a given word and/or (ii) the quality of linguistic knowledge representations motivate the development of completely unsupervised and knowledge-free approaches to word sense disambiguation (WSD). They are particularly useful for under-resourced languages which do not have any resources for building either supervised or knowledge-based models. In this paper, we present a method that takes as input a standard pre-trained word embedding model and induces a fully-fledged word sense inventory, which can be used for disambiguation in context. We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings by Grave et al. (2018), enabling WSD in these languages. Models and system are available online.

* 10 pages, 5 figures, 4 tables, accepted at LREC 2020

FacTweet: Profiling Fake News Twitter Accounts

Oct 15, 2019

Bilal Ghanem, Simone Paolo Ponzetto, Paolo Rosso

Abstract:We present an approach to detect fake news on Twitter at the account level using a neural recurrent model and a variety of semantic and stylistic features. Our method extracts a set of features from the timelines of news Twitter accounts by reading their posts as chunks, rather than dealing with each tweet independently. We show the experimental benefits of modeling latent stylistic signatures of mixed fake and real news with a sequential model over a wide range of strong baselines.

* 6 pages
