Raphaël Troncy

Searching News Articles Using an Event Knowledge Graph Leveraged by Wikidata

Apr 11, 2019
Charlotte Rudnik, Thibault Ehrhart, Olivier Ferret, Denis Teyssou, Raphaël Troncy, Xavier Tannier

News agencies produce thousands of multimedia stories describing events happening in the world that are either scheduled, such as sports competitions, political summits and elections, or breaking, such as military conflicts, terrorist attacks and natural disasters. When writing up those stories, journalists refer to contextual background and compare with similar past events. However, searching for precise facts described in stories is hard. In this paper, we propose a general method that leverages the Wikidata knowledge base to produce semantic annotations of news articles. Next, we describe a semantic search engine that supports both keyword-based search in news articles and structured data search, providing filters for properties belonging to specific event schemas that are automatically inferred.

* WikiWorkshop at The Web Conference 2019  

Analysis of Named Entity Recognition and Linking for Tweets

Oct 27, 2014
Leon Derczynski, Diana Maynard, Giuseppe Rizzo, Marieke van Erp, Genevieve Gorrell, Raphaël Troncy, Johann Petrak, Kalina Bontcheva

Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Information extraction from tweets is typically performed in a pipeline, comprising consecutive stages of language identification, tokenisation, part-of-speech tagging, named entity recognition and entity disambiguation (e.g. with respect to DBpedia). In this work, we describe a new Twitter entity disambiguation dataset, and conduct an empirical analysis of named entity recognition and disambiguation, investigating how robust a number of state-of-the-art systems are on such noisy texts, what the main sources of error are, and which problems should be further investigated to improve the state of the art.

* Information Processing & Management 51 (2), 32-49, 2014  
* 35 pages, accepted to journal Information Processing and Management 