Lukas Schmelzeisen

Wikidated 1.0: An Evolving Knowledge Graph Dataset of Wikidata's Revision History

Dec 09, 2021
Lukas Schmelzeisen, Corina Dima, Steffen Staab

Wikidata is the largest general-interest knowledge base that is openly available. It is collaboratively edited by thousands of volunteer editors and has thus evolved considerably since its inception in 2012. In this paper, we present Wikidated 1.0, a dataset of Wikidata's full revision history, which encodes changes between Wikidata revisions as sets of deletions and additions of RDF triples. To the best of our knowledge, it constitutes the first large dataset of an evolving knowledge graph, a recently emerging research subject in the Semantic Web community. We introduce the methodology for generating Wikidated 1.0 from dumps of Wikidata, discuss its implementation and limitations, and present statistical characteristics of the dataset.

* 15 pages, 4 figures. Published at Wikidata@ISWC 2021 
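
The abstract above describes changes between revisions as sets of deleted and added RDF triples. A minimal sketch of that idea, assuming triples are held as plain Python tuples; the names `RevisionDelta`, `diff_revisions`, and `apply_delta` are hypothetical and not taken from the Wikidated code base, and the label change in the example is made up for illustration:

```python
from typing import FrozenSet, Iterable, NamedTuple, Tuple

# An RDF triple as (subject, predicate, object); plain strings for illustration.
Triple = Tuple[str, str, str]


class RevisionDelta(NamedTuple):
    """Change between two revisions (hypothetical name, not from the dataset)."""
    deletions: FrozenSet[Triple]
    additions: FrozenSet[Triple]


def diff_revisions(old: Iterable[Triple], new: Iterable[Triple]) -> RevisionDelta:
    """Triples only in `old` are deletions; triples only in `new` are additions."""
    old_set, new_set = frozenset(old), frozenset(new)
    return RevisionDelta(deletions=old_set - new_set, additions=new_set - old_set)


def apply_delta(triples: Iterable[Triple], delta: RevisionDelta) -> FrozenSet[Triple]:
    """Reconstruct the newer revision from the older one plus the delta."""
    return (frozenset(triples) - delta.deletions) | delta.additions


# Illustrative revisions of one entity; the changed label is invented.
rev1 = {
    ("wd:Q42", "wdt:P31", "wd:Q5"),
    ("wd:Q42", "rdfs:label", '"Douglas Adams"@en'),
}
rev2 = {
    ("wd:Q42", "wdt:P31", "wd:Q5"),
    ("wd:Q42", "rdfs:label", '"Douglas Noel Adams"@en'),
}
delta = diff_revisions(rev1, rev2)
```

Storing only `delta` per revision keeps unchanged triples out of the record, and `apply_delta(rev1, delta)` recovers `rev2` exactly.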

Knowledge Graphs

Mar 28, 2020
Aidan Hogan, Eva Blomqvist, Michael Cochez, Claudia d'Amato, Gerard de Melo, Claudio Gutierrez, José Emilio Labra Gayo, Sabrina Kirrane, Sebastian Neumaier, Axel Polleres, Roberto Navigli, Axel-Cyrille Ngonga Ngomo, Sabbir M. Rashid, Anisa Rula, Lukas Schmelzeisen, Juan Sequeda, Steffen Staab, Antoine Zimmermann

In this paper we provide a comprehensive introduction to knowledge graphs, which have recently garnered significant attention from both industry and academia in scenarios that require exploiting diverse, dynamic, large-scale collections of data. After a general introduction, we motivate and contrast various graph-based data models and query languages that are used for knowledge graphs. We discuss the roles of schema, identity, and context in knowledge graphs. We explain how knowledge can be represented and extracted using a combination of deductive and inductive techniques. We summarise methods for the creation, enrichment, quality assessment, refinement, and publication of knowledge graphs. We provide an overview of prominent open knowledge graphs and enterprise knowledge graphs, their applications, and how they use the aforementioned techniques. We conclude with high-level future research directions for knowledge graphs.

* Revision from previous version: fixing flight companies in Figure 3 and changing some other details; giving Figure 4 analogous data to Figure 3 for easier comparison; updating the discussion of the figures in Section 2.1.3; updating Example B.6 to reflect the new Figure 4; minor formatting change for Figure 27

CLEARumor at SemEval-2019 Task 7: ConvoLving ELMo Against Rumors

Apr 05, 2019
Ipek Baris, Lukas Schmelzeisen, Steffen Staab

This paper describes our submission to SemEval-2019 Task 7: RumourEval: Determining Rumor Veracity and Support for Rumors. We participated in both subtasks. The goal of subtask A is to classify the type of interaction between a rumorous social media post and a reply post as support, query, deny, or comment. The goal of subtask B is to predict the veracity of a given rumor. For subtask A, we implement a CNN-based neural architecture using ELMo embeddings of post text combined with auxiliary features and achieve an F1-score of 44.6%. For subtask B, we employ an MLP neural network leveraging our estimates for subtask A and achieve an F1-score of 30.1% (second place in the competition). We provide results and analysis of our system performance and present ablation experiments.

* 5 pages, 2 figures, 3 tables. Accepted for publication at SemEval@NAACL-HLT 2019 

Learning Taxonomies of Concepts and not Words using Contextualized Word Representations: A Position Paper

Jan 31, 2019
Lukas Schmelzeisen, Steffen Staab

Taxonomies are semantic hierarchies of concepts. One limitation of current taxonomy learning systems is that they define concepts as single words. This position paper argues that contextualized word representations, which recently achieved state-of-the-art results on many competitive NLP tasks, are a promising method to address this limitation. We outline a novel approach for taxonomy learning that (1) defines concepts as synsets, (2) learns density-based approximations of contextualized word representations, and (3) can measure similarity and hypernymy among them.

* 5 pages, 1 figure 
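
One way to picture a "density-based approximation" with an asymmetric hypernymy signal is sketched below. This is an assumption for illustration, not the method proposed in the paper: each concept is summarised by a diagonal Gaussian fitted to its context vectors, and KL divergence serves as a directed measure. The toy 2-dimensional vectors and the function names are hypothetical.

```python
import math


def fit_diag_gaussian(vectors, eps=1e-6):
    """Fit per-dimension mean and variance to a list of equal-length vectors."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
    # eps keeps variances strictly positive so the log and division are safe.
    var = [sum((v[i] - mean[i]) ** 2 for v in vectors) / n + eps for i in range(dim)]
    return mean, var


def kl_divergence(p, q):
    """KL(p || q) for two diagonal Gaussians given as (mean, variance) pairs."""
    (mu_p, var_p), (mu_q, var_q) = p, q
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


# Toy "contextualized" vectors: a narrow concept and a broad one around it.
dog = fit_diag_gaussian([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]])
animal = fit_diag_gaussian([[0.0, 0.0], [2.0, 2.0], [1.0, 1.0], [0.0, 2.0], [2.0, 0.0]])

narrow_to_broad = kl_divergence(dog, animal)
broad_to_narrow = kl_divergence(animal, dog)
```

In this toy setup the narrow density fits inside the broad one, so `narrow_to_broad` is much smaller than `broad_to_narrow`; that asymmetry is what makes a divergence usable as a hypernymy signal, whereas a symmetric score would only capture similarity.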