Marco Del Tredici

From Rewriting to Remembering: Common Ground for Conversational QA Models

Apr 08, 2022
Marco Del Tredici, Xiaoyu Shen, Gianni Barlacchi, Bill Byrne, Adrià de Gispert

In conversational QA, models have to leverage information in previous turns to answer upcoming questions. Current approaches, such as Question Rewriting, struggle to extract relevant information as the conversation unwinds. We introduce the Common Ground (CG), an approach to accumulate conversational information as it emerges and select the relevant information at every turn. We show that CG offers a more efficient and human-like way to exploit conversational information compared to existing approaches, leading to improvements on Open Domain Conversational QA.

* Accepted at NLP for ConvAI 

Words are the Window to the Soul: Language-based User Representations for Fake News Detection

Nov 14, 2020
Marco Del Tredici, Raquel Fernández

Cognitive and social traits of individuals are reflected in language use. Moreover, individuals who are prone to spread fake news online often share common traits. Building on these ideas, we introduce a model that creates representations of individuals on social media based only on the language they produce, and use them to detect fake news. We show that language-based user representations are beneficial for this task. We also present an extended analysis of the language of fake news spreaders, showing that its main features are mostly domain independent and consistent across two English datasets. Finally, we exploit the relation between language use and connections in the social graph to assess the presence of the Echo Chamber effect in our data.

* 9 pages, accepted at COLING 2020 

Analysing Lexical Semantic Change with Contextualised Word Representations

Apr 29, 2020
Mario Giulianelli, Marco Del Tredici, Raquel Fernández

This paper presents the first unsupervised approach to lexical semantic change that makes use of contextualised word representations. We propose a novel method that exploits the BERT neural language model to obtain representations of word usages, clusters these representations into usage types, and measures change along time with three proposed metrics. We create a new evaluation dataset and show that the model representations and the detected semantic shifts are positively correlated with human judgements. Our extensive qualitative analysis demonstrates that our method captures a variety of synchronic and diachronic linguistic phenomena. We expect our work to inspire further research in this direction.

* To appear in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL-2020) 
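
The pipeline in the abstract above (usage representations, clustered into usage types, then compared across time) can be illustrated with a minimal sketch. The abstract does not name its three metrics, so the Jensen-Shannon divergence used here is only one plausible choice, and the function names and toy cluster labels are hypothetical. The sketch assumes the clustering step has already assigned each usage of a word an integer usage-type label.

```python
import math
from collections import Counter


def usage_distribution(labels, n_types):
    """Relative frequency of each usage type (0..n_types-1) in one time period."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(n_types)]


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical usage-type labels for one word in two time periods.
labels_early = [0, 0, 0, 1]
labels_late = [0, 1, 1, 1]

p = usage_distribution(labels_early, n_types=2)
q = usage_distribution(labels_late, n_types=2)
change_score = jensen_shannon_divergence(p, q)
```

A score of 0 means the word's usage types are distributed identically in both periods; with base-2 logarithms the score is at most 1, reached when the two periods share no usage type.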

You Shall Know a User by the Company It Keeps: Dynamic Representations for Social Media Users in NLP

Sep 01, 2019
Marco Del Tredici, Diego Marcheggiani, Sabine Schulte im Walde, Raquel Fernández

Information about individuals can help to better understand what they say, particularly in social media where texts are short. Current approaches to modelling social media users pay attention to their social connections, but exploit this information in a static way, treating all connections uniformly. This ignores the fact, well known in sociolinguistics, that an individual may be part of several communities which are not equally relevant in all communicative situations. We present a model based on Graph Attention Networks that captures this observation. It dynamically explores the social graph of a user, computes a user representation given the most relevant connections for a target task, and combines it with linguistic information to make a prediction. We apply our model to three different tasks, evaluate it against alternative models, and analyse the results extensively, showing that it significantly outperforms other current methods.

* To appear in Proceedings of EMNLP 2019

Abusive Language Detection with Graph Convolutional Networks

Apr 05, 2019
Pushkar Mishra, Marco Del Tredici, Helen Yannakoudakis, Ekaterina Shutova

Abuse on the Internet represents a significant societal problem of our time. Previous research on automated abusive language detection in Twitter has shown that community-based profiling of users is a promising technique for this task. However, existing approaches only capture shallow properties of online communities by modeling follower-following relationships. In contrast, working with graph convolutional networks (GCNs), we present the first approach that captures not only the structure of online communities but also the linguistic behavior of the users within them. We show that such a heterogeneous graph-structured modeling of communities significantly advances the current state of the art in abusive language detection.

* Proceedings of the 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT) 

Author Profiling for Hate Speech Detection

Feb 14, 2019
Pushkar Mishra, Marco Del Tredici, Helen Yannakoudakis, Ekaterina Shutova

The rapid growth of social media in recent years has fed into some highly undesirable phenomena such as proliferation of abusive and offensive language on the Internet. Previous research suggests that such hateful content tends to come from users who share a set of common stereotypes and form communities around them. The current state-of-the-art approaches to hate speech detection are oblivious to user and community information and rely entirely on textual (i.e., lexical and semantic) cues. In this paper, we propose a novel approach to this problem that incorporates community-based profiling features of Twitter users. Experimenting with a dataset of 16k tweets, we show that our methods significantly outperform the current state of the art in hate speech detection. Further, we conduct a qualitative analysis of model characteristics. We release our code, pre-trained models and all the resources used in the public domain.

* Proceedings of the 27th International Conference on Computational Linguistics (COLING) 2018. arXiv admin note: text overlap with arXiv:1809.00378 

Short-term meaning shift: an exploratory distributional analysis

Sep 10, 2018
Marco Del Tredici, Raquel Fernández, Gemma Boleda

We investigate diachronic meaning shift that takes place in short periods of time (short-term meaning shift) and in an online community of speakers. We create a small dataset and use it to assess the performance of a standard model for meaning shift detection on short-term meaning shift, and find that this phenomenon poses specific difficulties for models based on the Distributional Hypothesis.

Semantic Variation in Online Communities of Practice

Jun 15, 2018
Marco Del Tredici, Raquel Fernández

We introduce a framework for quantifying semantic variation of common words in Communities of Practice and in sets of topic-related communities. We show that while some meaning shifts are shared across related communities, others are community-specific, and therefore independent from the discussed topic. We propose such findings as evidence in favour of sociolinguistic theories of socially-driven semantic variation. Results are evaluated using an independent language modelling task. Furthermore, we investigate extralinguistic features and show that factors such as prominence and dissemination of words are related to semantic variation.

* 13 pages, Proceedings of the 12th International Conference on Computational Semantics (IWCS 2017) 

The Road to Success: Assessing the Fate of Linguistic Innovations in Online Communities

Jun 15, 2018
Marco Del Tredici, Raquel Fernández

We investigate the birth and diffusion of lexical innovations in a large dataset of online social communities. We build on sociolinguistic theories and focus on the relation between the spread of a novel term and the social role of the individuals who use it, uncovering characteristics of innovators and adopters. Finally, we perform a prediction task that allows us to anticipate whether an innovation will successfully spread within a community.

* 13 pages, Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018) 

Tracing metaphors in time through self-distance in vector spaces

Nov 10, 2016
Marco Del Tredici, Malvina Nissim, Andrea Zaninello

From a diachronic corpus of Italian, we build consecutive vector spaces in time and use them to compare a term's cosine similarity to itself in different time spans. We assume that a drop in similarity might be related to the emergence of a metaphorical sense at a given time. Similarity-based observations are matched to the actual year when a figurative meaning was documented in a reference dictionary and through manual inspection of corpus occurrences.

* Proceedings of the Third Italian Conference on Computational Linguistics (CLIC 2016) 
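
The self-similarity idea in the abstract above can be sketched as follows. This is a toy illustration, not the authors' implementation: it assumes the vectors for a term in consecutive time spans are already comparable (for example, because each space was built on top of the previous one), and the function names and the example vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def self_similarity_series(vectors_by_period):
    """Similarity of a term to itself between each pair of consecutive periods."""
    periods = sorted(vectors_by_period)
    return [
        (earlier, later, cosine(vectors_by_period[earlier], vectors_by_period[later]))
        for earlier, later in zip(periods, periods[1:])
    ]


def largest_drop(series):
    """The consecutive pair of periods with the lowest self-similarity."""
    return min(series, key=lambda item: item[2])


# Hypothetical vectors for a single term in three time spans.
term_vectors = {
    1900: [1.0, 0.0],
    1950: [1.0, 0.1],
    2000: [0.0, 1.0],
}

series = self_similarity_series(term_vectors)
earlier, later, similarity = largest_drop(series)
```

In this toy data the lowest self-similarity falls between 1950 and 2000, which under the abstract's assumption would be the span to check against a dictionary's first attestation of a figurative sense.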