Performing link prediction using knowledge graph embedding (KGE) models is a popular approach for knowledge graph completion. Such link predictions are performed by measuring the likelihood of links in the graph via a transformation function that maps nodes via edges into a vector space. Since the complex structure of the real world is reflected in multi-relational knowledge graphs, the transformation functions need to be able to represent this complexity. However, most of the existing transformation functions in embedding models have been designed in Euclidean geometry and only cover one or two simple transformations. Therefore, they are prone to underfitting and limited in their ability to embed complex graph structures. The area of projective geometry, however, fully covers inversion, reflection, translation, rotation, and homothety transformations. We propose a novel KGE model, which supports those transformations and subsumes other state-of-the-art models. The model has several favorable theoretical properties and outperforms existing approaches on widely used link prediction benchmarks.
Industry~4.0 (I4.0) standards and standardization frameworks have been proposed with the goal of \emph{empowering interoperability} in smart factories. These standards enable the description and interaction of the main components, systems, and processes inside of a smart factory. Due to the growing number of frameworks and standards, there is an increasing need for approaches that automatically analyze the landscape of I4.0 standards. Standardization frameworks classify standards according to their functions into layers and dimensions. However, similar standards can be classified differently across the frameworks, producing, thus, interoperability conflicts among them. Semantic-based approaches that rely on ontologies and knowledge graphs, have been proposed to represent standards, known relations among them, as well as their classification according to existing frameworks. Albeit informative, the structured modeling of the I4.0 landscape only provides the foundations for detecting interoperability issues. Thus, graph-based analytical methods able to exploit knowledge encoded by these approaches, are required to uncover alignments among standards. We study the relatedness among standards and frameworks based on community analysis to discover knowledge that helps to cope with interoperability conflicts between standards. We use knowledge graph embeddings to automatically create these communities exploiting the meaning of the existing relationships. In particular, we focus on the identification of similar standards, i.e., communities of standards, and analyze their properties to detect unknown relations. We empirically evaluate our approach on a knowledge graph of I4.0 standards using the Trans$^*$ family of embedding models for knowledge graph entities. Our results are promising and suggest that relations among standards can be detected accurately.
Entity linking - connecting entity mentions in a natural language utterance to knowledge graph (KG) entities is a crucial step for question answering over KGs. It is often based on measuring the string similarity between the entity label and its mention in the question. The relation referred to in the question can help to disambiguate between entities with the same label. This can be misleading if an incorrect relation has been identified in the relation linking step. However, an incorrect relation may still be semantically similar to the relation in which the correct entity forms a triple within the KG; which could be captured by the similarity of their KG embeddings. Based on this idea, we propose the first end-to-end neural network approach that employs KG as well as word embeddings to perform joint relation and entity classification of simple questions while implicitly performing entity disambiguation with the help of a novel gating mechanism. An empirical evaluation shows that the proposed approach achieves a performance comparable to state-of-the-art entity linking while requiring less post-processing.
Attention-based encoder-decoder neural network models have recently shown promising results in goal-oriented dialogue systems. However, these models struggle to reason over and incorporate state-full knowledge while preserving their end-to-end text generation functionality. Since such models can greatly benefit from user intent and knowledge graph integration, in this paper we propose an RNN-based end-to-end encoder-decoder architecture which is trained with joint embeddings of the knowledge graph and the corpus as input. The model provides an additional integration of user intent along with text generation, trained with a multi-task learning paradigm along with an additional regularization technique to penalize generating the wrong entity as output. The model further incorporates a Knowledge Graph entity lookup during inference to guarantee the generated output is state-full based on the local knowledge graph provided. We finally evaluated the model using the BLEU score, empirical evaluation depicts that our proposed architecture can aid in the betterment of task-oriented dialogue system`s performance.
The Entity Linking (EL) approaches have been a long-standing research field and find applicability in various use cases such as semantic search, text annotation, question answering, etc. Although effective and robust, current approaches are still limited to particular knowledge repositories (e.g. Wikipedia) or specific knowledge graphs (e.g. Freebase, DBpedia, and YAGO). The collaborative knowledge graphs such as Wikidata excessively rely on the crowd to author the information. Since the crowd is not bound to a standard protocol for assigning entity titles, the knowledge graph is populated by non-standard, noisy, long or even sometimes awkward titles. The issue of long, implicit, and nonstandard entity representations is a challenge in EL approaches for gaining high precision and recall. In this paper, we advance the state-of-the-art approaches by developing a context-aware attentive neural network approach for entity linking on Wikidata. Our approach contributes by exploiting the sufficient context from a Knowledge Graph as a source of background knowledge, which is then fed into the neural network. This approach demonstrates merit to address challenges associated with entity titles (multi-word, long, implicit, case-sensitive). Our experimental study shows $\approx$8\% improvements over the baseline approach, and significantly outperform an end to end approach for Wikidata entity linking. This work, first of its kind, opens a new direction for the research community to pay attention to developing context-aware EL approaches for collaborative knowledge graphs.
We propose an approach to predict the natural gas price in several days using historical price data and events extracted from news headlines. Most previous methods treats price as an extrapolatable time series, those analyze the relation between prices and news either trim their price data correspondingly to a public news dataset, manually annotate headlines or use off-the-shelf tools. In comparison to off-the-shelf tools, our event extraction method detects not only the occurrence of phenomena but also the changes in attribution and characteristics from public sources. Instead of using sentence embedding as a feature, we use every word of the extracted events, encode and organize them before feeding to the learning models. Empirical results show favorable results, in terms of prediction performance, money saved and scalability.
Resource Description Framework (RDF) triplestores and Property Graph (PG) database systems are two approaches for data management that are based on modeling, storing and querying graph-like data. Given the heterogeneity between these systems, it becomes necessary to develop methods to allow interoperability among them. While there exist some approaches to exchange data and schema between RDF and PG databases, they lack compatibility and even a solid formal foundation. In this paper, we study the semantic interoperability between RDF and PG databases. Specifically, we present two direct mappings (schema-dependent and schema-independent) for transforming an RDF database into a PG database. We show that the proposed mappings possess the fundamental properties of semantics preservation and information preservation. The existence of both mappings allows us to conclude that the PG data model subsumes the expressiveness or information capacity of the RDF data model.
Knowledge Graph (KG) embedding has attracted more attention in recent years. Most of KG embedding models learn from time-unaware triples. However, the inclusion of temporal information beside triples would further improve the performance of a KGE model. In this regard, we propose ATiSE, a temporal KG embedding model which incorporates time information into entity/relation representations by using Additive Time Series decomposition. Moreover, considering the temporal uncertainty during the evolution of entity/relation representations over time, we map the representations of temporal KGs into the space of multi-dimensional Gaussian distributions. The mean of each entity/relation embedding at a time step shows the current expected position, whereas its covariance (which is temporally stationary) represents its temporal uncertainty. Experimental results show that ATiSE not only achieves the state-of-the-art on link prediction over temporal KGs, but also can predict the occurrence time of facts with missing time annotations, as well as the existence of future events. To the best of our knowledge, no other model is capable to perform all these tasks.
Non-goal oriented, generative dialogue systems lack the ability to generate answers with grounded facts. A knowledge graph can be considered an abstraction of the real world consisting of well-grounded facts. This paper addresses the problem of generating well grounded responses by integrating knowledge graphs into the dialogue systems response generation process, in an end-to-end manner. A dataset for nongoal oriented dialogues is proposed in this paper in the domain of soccer, conversing on different clubs and national teams along with a knowledge graph for each of these teams. A novel neural network architecture is also proposed as a baseline on this dataset, which can integrate knowledge graphs into the response generation process, producing well articulated, knowledge grounded responses. Empirical evidence suggests that the proposed model performs better than other state-of-the-art models for knowledge graph integrated dialogue systems.
Knowledge graphs (KGs) represent world's facts in structured forms. KG completion exploits the existing facts in a KG to discover new ones. Translation-based embedding model (TransE) is a prominent formulation to do KG completion. Despite the efficiency of TransE in memory and time, it suffers from several limitations in encoding relation patterns such as symmetric, reflexive etc. To resolve this problem, most of the attempts have circled around the revision of the score function of TransE i.e., proposing a more complicated score function such as Trans(A, D, G, H, R, etc) to mitigate the limitations. In this paper, we tackle this problem from a different perspective. We show that existing theories corresponding to the limitations of TransE are inaccurate because they ignore the effect of loss function. Accordingly, we pose theoretical investigations of the main limitations of TransE in the light of loss function. To the best of our knowledge, this has not been investigated so far comprehensively. We show that by a proper selection of the loss function for training the TransE model, the main limitations of the model are mitigated. This is explained by setting upper-bound for the scores of positive samples, showing the region of truth (i.e., the region that a triple is considered positive by the model). Our theoretical proofs with experimental results fill the gap between the capability of translation-based class of embedding models and the loss function. The theories emphasise the importance of the selection of the loss functions for training the models. Our experimental evaluations on different loss functions used for training the models justify our theoretical proofs and confirm the importance of the loss functions on the performance.