Kalina Bontcheva

University of Sheffield

Classification Aware Neural Topic Model and its Application on a New COVID-19 Disinformation Corpus

Jun 05, 2020

Xingyi Song, Johann Petrak, Ye Jiang, Iknoor Singh, Diana Maynard, Kalina Bontcheva

Abstract: The explosion of disinformation related to the COVID-19 pandemic has overloaded fact-checkers and media worldwide. To help tackle this, we developed computational methods to support COVID-19 disinformation debunking and social impacts research. This paper presents: 1) the currently largest available manually annotated COVID-19 disinformation category dataset; and 2) a classification-aware neural topic model (CANTM) that combines classification and topic modelling under a variational autoencoder framework. We demonstrate that CANTM efficiently improves classification performance with low resources, and is scalable. In addition, the classification-aware topics help researchers and end-users to better understand the classification results.

Towards an Interoperable Ecosystem of AI and LT Platforms: A Roadmap for the Implementation of Different Levels of Interoperability

Apr 17, 2020

Georg Rehm, Dimitrios Galanis, Penny Labropoulou, Stelios Piperidis, Martin Welß, Ricardo Usbeck, Joachim Köhler, Miltos Deligiannis, Katerina Gkirtzou, Johannes Fischer(+14 more)

Abstract: With regard to the wider area of AI/LT platform interoperability, we concentrate on two core aspects: (1) cross-platform search and discovery of resources and services; (2) composition of cross-platform service workflows. We devise five levels of platform interoperability, of increasing complexity, that we suggest implementing in a wider federation of AI/LT platforms. We illustrate the approach using the five emerging AI/LT platforms AI4EU, ELG, Lynx, QURATOR and SPEAKER.

* Proceedings of the 1st International Workshop on Language Technology Platforms (IWLTP 2020). To appear

The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe

Mar 30, 2020

Georg Rehm, Katrin Marheinecke, Stefanie Hegele, Stelios Piperidis, Kalina Bontcheva, Jan Hajič, Khalid Choukri, Andrejs Vasiļjevs, Gerhard Backfried, Christoph Prinz(+37 more)

Abstract: Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality. However, language barriers impacting business, cross-lingual and cross-cultural communication are still omnipresent. Language Technologies (LTs) are a powerful means to break down these barriers. While the last decade has seen various initiatives that created a multitude of approaches and technologies tailored to Europe's specific needs, there is still an immense level of fragmentation. At the same time, AI has become an increasingly important concept in the European Information and Communication Technology area. For a few years now, AI, including many opportunities, synergies but also misconceptions, has been overshadowing every other topic. We present an overview of the European LT landscape, describing funding programmes, activities, actions and challenges in the different countries with regard to LT, including the current state of play in industry and the LT market. We present a brief overview of the main LT-related activities on the EU level in the last ten years and develop strategic guidance with regard to four key dimensions.

* Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). To appear

European Language Grid: An Overview

Mar 30, 2020

Georg Rehm, Maria Berger, Ela Elsholz, Stefanie Hegele, Florian Kintzel, Katrin Marheinecke, Stelios Piperidis, Miltos Deligiannis, Dimitris Galanis, Katerina Gkirtzou(+26 more)

Abstract: With 24 official EU and many additional languages, multilingualism in Europe and an inclusive Digital Single Market can only be enabled through Language Technologies (LTs). European LT business is dominated by hundreds of SMEs and a few large players. Many are world-class, with technologies that outperform the global players. However, European LT business is also fragmented, by nation states, languages, verticals and sectors, significantly holding back its impact. The European Language Grid (ELG) project addresses this fragmentation by establishing the ELG as the primary platform for LT in Europe. The ELG is a scalable cloud platform, providing, in an easy-to-integrate way, access to hundreds of commercial and non-commercial LTs for all European languages, including running tools and services as well as data sets and resources. Once fully operational, it will enable the commercial and non-commercial European LT community to deposit and upload their technologies and data sets into the ELG, to deploy them through the grid, and to connect with other resources. The ELG will boost the Multilingual Digital Single Market towards a thriving European LT community, creating new jobs and opportunities. Furthermore, the ELG project organises two open calls for up to 20 pilot projects. It also sets up 32 National Competence Centres (NCCs) and the European LT Council (LTC) for outreach and coordination purposes.

* Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). To appear

The evolution of argumentation mining: From models to social media and emerging tools

Jul 04, 2019

Anastasios Lytos, Thomas Lagkas, Panagiotis Sarigiannidis, Kalina Bontcheva

Abstract: Argumentation mining is a rising subject in the computational linguistics domain focusing on extracting structured arguments from natural text, often from unstructured or noisy text. The initial approaches to modeling arguments aimed to identify a flawless argument in specific fields (law, scientific papers) serving specific needs (completeness, effectiveness). With the emergence of Web 2.0 and the explosion in the use of social media, both the diffusion of the data and the argument structure have changed. In this survey article, we bridge the gap between theoretical approaches of argumentation mining and pragmatic schemes that satisfy the needs of social media generated data, recognizing the need for adopting more flexible and expandable schemes, capable of adjusting to the argumentation conditions that exist in social media. We review, compare, and classify existing approaches, techniques and tools, identifying the positive outcome of combining tasks and features, and finally propose a conceptual architecture framework. The proposed theoretical framework is an argumentation mining scheme able to identify the distinct sub-tasks and capture the needs of social media text, revealing the need for adopting more flexible and extensible frameworks.

* Information Processing & Management, Volume 56, Issue 6, 2019 (Elsevier, accepted version)

RumourEval 2019: Determining Rumour Veracity and Support for Rumours

Sep 18, 2018

Genevieve Gorrell, Kalina Bontcheva, Leon Derczynski, Elena Kochkina, Maria Liakata, Arkaitz Zubiaga

Abstract: This is the proposal for RumourEval-2019, which will run in early 2019 as part of that year's SemEval event. Since the first RumourEval shared task in 2017, interest in automated claim validation has greatly increased, as the dangers of "fake news" have become a mainstream concern. Yet automated support for rumour checking remains in its infancy. For this reason, it is important that a shared task in this area continues to provide a focus for effort, which is likely to increase. We therefore propose a continuation in which the veracity of further rumours is determined and, as previously and in support of this goal, tweets discussing them are classified according to the stance they take regarding the rumour. Scope is extended compared with the first RumourEval, in that the dataset is substantially expanded to include Reddit as well as Twitter data, and additional languages are also included.

Detection and Resolution of Rumours in Social Media: A Survey

Apr 03, 2018

Arkaitz Zubiaga, Ahmet Aker, Kalina Bontcheva, Maria Liakata, Rob Procter

Abstract: Despite the increasing use of social media platforms for information and news gathering, their unmoderated nature often leads to the emergence and spread of rumours, i.e. pieces of information that are unverified at the time of posting. At the same time, the openness of social media platforms provides opportunities to study how users share and discuss rumours, and to explore how natural language processing and data mining techniques may be used to find ways of determining their veracity. In this survey we introduce and discuss two types of rumours that circulate on social media: long-standing rumours that circulate for long periods of time, and newly-emerging rumours spawned during fast-paced events such as breaking news, where reports are released piecemeal and often with an unverified status in their early stages. We provide an overview of research into social media rumours with the ultimate goal of developing a rumour classification system that consists of four components: rumour detection, rumour tracking, rumour stance classification and rumour veracity classification. We delve into the approaches presented in the scientific literature for the development of each of these four components. We summarise the efforts and achievements so far towards the development of rumour classification systems and conclude with suggestions for avenues for future research in social media mining for detection and resolution of rumours.

* ACM Computing Surveys 51, 2, Article 32 (February 2018), 36 pages

Helping Crisis Responders Find the Informative Needle in the Tweet Haystack

Jan 29, 2018

Leon Derczynski, Kenny Meesters, Kalina Bontcheva, Diana Maynard

Abstract: Crisis responders are increasingly using social media, data and other digital sources of information to build a situational understanding of a crisis in order to design an effective response. However, with the increased availability of such data, the challenge of identifying relevant information from it also increases. This paper presents a successful automatic approach to handling this problem. Messages are filtered for informativeness based on a definition of the concept drawn from prior research and crisis response experts. Informative messages are tagged for actionable data -- for example, people in need, threats to rescue efforts, changes in environment, and so on. In all, eight categories of actionability are identified. The two components -- informativeness and actionability classification -- are packaged together as an openly-available tool called Emina (Emergent Informativeness and Actionability).

Discourse-Aware Rumour Stance Classification in Social Media Using Sequential Classifiers

Dec 06, 2017

Arkaitz Zubiaga, Elena Kochkina, Maria Liakata, Rob Procter, Michal Lukasik, Kalina Bontcheva, Trevor Cohn, Isabelle Augenstein

Abstract: Rumour stance classification, defined as classifying the stance of specific social media posts into one of supporting, denying, querying or commenting on an earlier post, is becoming of increasing interest to researchers. While most previous work has focused on using individual tweets as classifier inputs, here we report on the performance of sequential classifiers that exploit the discourse features inherent in social media interactions or 'conversational threads'. Testing the effectiveness of four sequential classifiers -- Hawkes Processes, Linear-Chain Conditional Random Fields (Linear CRF), Tree-Structured Conditional Random Fields (Tree CRF) and Long Short Term Memory networks (LSTM) -- on eight datasets associated with breaking news stories, and looking at different types of local and contextual features, our work sheds new light on the development of accurate stance classifiers. We show that sequential classifiers that exploit the use of discourse properties in social media conversations, while using only local features, outperform non-sequential classifiers. Furthermore, we show that LSTM using a reduced set of features can outperform the other sequential classifiers; this performance is consistent across datasets and across types of stances. To conclude, our work also analyses the different features under study, identifying those that best help characterise and distinguish between stances, such as supporting tweets being more likely to be accompanied by evidence than denying tweets. We also set forth a number of directions for future research.

* Information Processing & Management, Volume 54, Issue 2, March 2018, Pages 273-290

Simple Open Stance Classification for Rumour Analysis

Sep 14, 2017

Ahmet Aker, Leon Derczynski, Kalina Bontcheva

Abstract: Stance classification determines the attitude, or stance, in a (typically short) text. The task has powerful applications, such as the detection of fake news or the automatic extraction of attitudes toward entities or events in the media. This paper describes a surprisingly simple and efficient classification approach to open stance classification in Twitter, for rumour and veracity classification. The approach profits from a novel set of automatically identifiable problem-specific features, which significantly boost classifier accuracy and achieve above state-of-the-art results on recent benchmark datasets. This calls into question the value of using complex sophisticated models for stance classification without first doing informed feature extraction.

* In RANLP 2017
