Man Lan

An Effective and Efficient Time-aware Entity Alignment Framework via Two-aspect Three-view Label Propagation

Jul 12, 2023
Li Cai, Xin Mao, Youshao Xiao, Changxu Wu, Man Lan

Entity alignment (EA) aims to find the equivalent entity pairs between different knowledge graphs (KGs), which is crucial for promoting knowledge fusion. With the wide use of temporal knowledge graphs (TKGs), time-aware EA (TEA) methods have emerged to enhance EA. Existing TEA models are based on Graph Neural Networks (GNNs) and achieve state-of-the-art (SOTA) performance, but they are difficult to transfer to large-scale TKGs due to the scalability issues of GNNs. In this paper, we propose an effective and efficient non-neural EA framework between TKGs, namely LightTEA, which consists of four essential components: (1) Two-aspect Three-view Label Propagation, (2) Sparse Similarity with Temporal Constraints, (3) Sinkhorn Operator, and (4) Temporal Iterative Learning. These modules work together to improve EA performance while reducing the model's time consumption. Extensive experiments on public datasets indicate that our proposed model significantly outperforms the SOTA methods for EA between TKGs, while running in at most a few dozen seconds, less than 10% of the time required by the most efficient existing TEA method.
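
To make the Sinkhorn Operator component concrete, here is a minimal NumPy sketch (not the paper's implementation, which works on sparse similarities) that turns a dense entity similarity matrix into an approximately doubly-stochastic matching matrix; the temperature and iteration count are illustrative choices.

```python
import numpy as np

def sinkhorn(sim: np.ndarray, tau: float = 0.05, n_iters: int = 10) -> np.ndarray:
    """Sinkhorn operator sketch: alternately normalize rows and columns so
    the result is approximately doubly stochastic, i.e. each source entity
    is softly matched to exactly one target entity and vice versa."""
    K = np.exp(sim / tau)  # temperature tau sharpens the matching
    for _ in range(n_iters):
        K /= K.sum(axis=1, keepdims=True)  # rows sum to one
        K /= K.sum(axis=0, keepdims=True)  # columns sum to one
    return K

# Toy usage: 3 source vs. 3 target entities.
sim = np.array([[0.9, 0.1, 0.0],
                [0.2, 0.8, 0.1],
                [0.0, 0.1, 0.7]])
print(sinkhorn(sim).argmax(axis=1))  # predicted alignment: [0 1 2]
```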

* Accepted by IJCAI 2023 

LightEA: A Scalable, Robust, and Interpretable Entity Alignment Framework via Three-view Label Propagation

Oct 20, 2022
Xin Mao, Wenting Wang, Yuanbin Wu, Man Lan

Entity Alignment (EA) aims to find equivalent entity pairs between KGs and is the core step of bridging and integrating multi-source KGs. In this paper, we argue that existing GNN-based EA methods inherit the inborn defects of their neural network lineage: weak scalability and poor interpretability. Inspired by recent studies, we reinvent the Label Propagation algorithm to run effectively on KGs and propose a non-neural EA framework -- LightEA, consisting of three efficient components: (i) Random Orthogonal Label Generation, (ii) Three-view Label Propagation, and (iii) Sparse Sinkhorn Iteration. According to extensive experiments on public datasets, LightEA has impressive scalability, robustness, and interpretability. With merely one-tenth of the time consumption, LightEA achieves results comparable to state-of-the-art methods across all datasets and even surpasses them on many.
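
As a rough illustration of components (i) and (ii), the sketch below assigns entities nearly orthogonal random labels and propagates them over the graph. Assumptions: integer entity ids, plain NumPy/SciPy, and only the entity-adjacency view of the three views; the paper also propagates over relation and temporal views.

```python
import numpy as np
from scipy.sparse import csr_matrix

def nearly_orthogonal_labels(n_entities: int, dim: int, seed: int = 0) -> np.ndarray:
    """Random unit vectors in high dimensions are almost orthogonal to one
    another, so they can serve as (approximately) independent entity labels."""
    rng = np.random.default_rng(seed)
    labels = rng.standard_normal((n_entities, dim))
    return labels / np.linalg.norm(labels, axis=1, keepdims=True)

def propagate(labels: np.ndarray, triples, n_entities: int, rounds: int = 2) -> np.ndarray:
    """Propagate labels along KG edges: each round, every entity adds in the
    labels of its neighbours via a sparse symmetric adjacency matrix."""
    heads = np.array([h for h, _, t in triples])
    tails = np.array([t for h, _, t in triples])
    rows = np.concatenate([heads, tails])
    cols = np.concatenate([tails, heads])
    adj = csr_matrix((np.ones(len(rows)), (rows, cols)),
                     shape=(n_entities, n_entities))
    for _ in range(rounds):
        labels = labels + adj @ labels
        labels /= np.linalg.norm(labels, axis=1, keepdims=True)
    return labels

# Toy usage: 4 entities connected by 3 triples (head, relation, tail).
triples = [(0, 0, 1), (1, 1, 2), (2, 0, 3)]
features = propagate(nearly_orthogonal_labels(4, dim=64), triples, n_entities=4)
```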

* 15 pages; Accepted by EMNLP 2022 (Main Conference)

Prompt-based Connective Prediction Method for Fine-grained Implicit Discourse Relation Recognition

Oct 16, 2022
Hao Zhou, Man Lan, Yuanbin Wu, Yuefeng Chen, Meirong Ma

Due to the absence of connectives, implicit discourse relation recognition (IDRR) remains a challenging and crucial task in discourse analysis. Most current work adopts multi-task learning to aid IDRR through explicit discourse relation recognition (EDRR) or utilizes dependencies between discourse relation labels to constrain model predictions. However, these methods still perform poorly on fine-grained IDRR and fail almost entirely on most few-shot discourse relation classes. To address these problems, we propose a novel Prompt-based Connective Prediction (PCP) method for IDRR. Our method instructs large-scale pre-trained models to use knowledge relevant to discourse relations and exploits the strong correlation between connectives and discourse relations to help the model recognize implicit discourse relations. Experimental results show that our method surpasses the current state-of-the-art model and achieves significant improvements on fine-grained few-shot discourse relations. Moreover, our approach can be transferred to EDRR with acceptable results. Our code is released at https://github.com/zh-i9/PCP-for-IDRR.
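
A minimal zero-shot sketch of the connective-prediction idea using Hugging Face Transformers: a masked language model fills in a connective between the two arguments, and a hand-written connective-to-relation map yields the relation. The backbone, template, and mapping here are illustrative only; the paper's templates and verbalizer are richer, and its model is fine-tuned.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Toy connective -> relation map; the paper's verbalizer covers far more
# connectives and the full fine-grained sense hierarchy.
CONNECTIVE_TO_RELATION = {
    "because": "Contingency.Cause",
    "however": "Comparison.Contrast",
    "then": "Temporal.Asynchronous",
    "instead": "Expansion.Alternative",
}

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def predict_relation(arg1: str, arg2: str) -> str:
    # Template: "<arg1> [MASK] <arg2>" -- the MLM scores candidate connectives.
    enc = tok(f"{arg1} {tok.mask_token} {arg2}", return_tensors="pt")
    mask_idx = (enc.input_ids[0] == tok.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = mlm(**enc).logits[0, mask_idx]
    best = max(CONNECTIVE_TO_RELATION,
               key=lambda c: logits[tok.convert_tokens_to_ids(c)].item())
    return CONNECTIVE_TO_RELATION[best]

print(predict_relation("It was raining heavily,", "the match was cancelled."))
```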

* Accepted by Findings of EMNLP 2022 

A Simple Temporal Information Matching Mechanism for Entity Alignment Between Temporal Knowledge Graphs

Sep 20, 2022
Li Cai, Xin Mao, Meirong Ma, Hao Yuan, Jianchao Zhu, Man Lan

Entity alignment (EA) aims to find entities in different knowledge graphs (KGs) that refer to the same real-world object. Recent studies incorporate temporal information to augment KG representations: existing methods for EA between temporal KGs (TKGs) use time-aware attention mechanisms to incorporate relational and temporal information into entity embeddings, outperforming earlier methods by exploiting temporal information. However, we argue that learning embeddings of temporal information is unnecessary, since most TKGs have uniform temporal representations. Therefore, we propose a simple graph neural network (GNN) model combined with a temporal information matching mechanism, which achieves better performance with less time and fewer parameters. Furthermore, since alignment seeds are difficult to label in real-world applications, we also propose a method to generate unsupervised alignment seeds from the temporal information of the TKG. Extensive experiments on public datasets indicate that our supervised method significantly outperforms previous methods and that the unsupervised variant achieves competitive performance.
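
A small sketch of what a temporal information matching mechanism might look like (illustrative, not the paper's exact formulation): entities are compared by the overlap of the timestamp sets attached to their facts, and high-confidence matches double as unsupervised alignment seeds.

```python
def temporal_similarity(times1: set, times2: set) -> float:
    """Jaccard overlap of the timestamp sets attached to two entities:
    entities denoting the same real-world object tend to appear in facts
    at the same times, so a high overlap is evidence of alignment."""
    if not times1 or not times2:
        return 0.0
    return len(times1 & times2) / len(times1 | times2)

def unsupervised_seeds(kg1: dict, kg2: dict, threshold: float = 0.9) -> list:
    """Harvest high-confidence seeds without labels: pair each entity in
    kg1 with its best temporal match in kg2 if the overlap is high enough."""
    seeds = []
    for e1, t1 in kg1.items():
        best = max(kg2, key=lambda e2: temporal_similarity(t1, kg2[e2]))
        if temporal_similarity(t1, kg2[best]) >= threshold:
            seeds.append((e1, best))
    return seeds

# Toy usage: entity -> set of timestamps observed in its facts.
kg1 = {"a": {2001, 2005, 2010}, "b": {1999}}
kg2 = {"x": {2001, 2005, 2010}, "y": {1980, 1999}}
print(unsupervised_seeds(kg1, kg2))  # [('a', 'x')]
```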

* Accepted by COLING 2022 

Few Clean Instances Help Denoising Distant Supervision

Sep 14, 2022
Yufang Liu, Ziyin Huang, Yijun Wang, Changzhi Sun, Man Lan, Yuanbin Wu, Xiaofeng Mou, Ding Wang

Existing distantly supervised relation extractors usually rely on noisy data for both model training and evaluation, which may lead to garbage-in-garbage-out systems. To alleviate the problem, we study whether a small clean dataset can help improve the quality of distantly supervised models. We show that, besides enabling a more convincing evaluation, a small clean dataset also helps us build more robust denoising models. Specifically, we propose a new criterion for clean instance selection based on influence functions: it collects sample-level evidence for recognizing good instances, which is more informative than loss-level evidence. We also propose a teacher-student mechanism for controlling the purity of intermediate results when bootstrapping the clean set. The whole approach is model-agnostic and demonstrates strong performance on denoising both real (NYT) and synthetic noisy datasets.
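
A simplified first-order sketch of influence-based selection in PyTorch (the full influence function involves the inverse Hessian, omitted here): a candidate instance whose gradient aligns with the clean set's gradient is likely to be a good instance, because training on it would also reduce the clean loss.

```python
import torch

def gradient_vector(model, loss_fn, x, y) -> torch.Tensor:
    """Flattened gradient of the loss on a batch w.r.t. all model parameters."""
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

def influence_scores(model, loss_fn, clean_batch, candidates) -> list:
    """Score each noisy candidate by how well its gradient aligns with the
    clean-set gradient; high-scoring candidates are kept as 'clean'."""
    g_clean = gradient_vector(model, loss_fn, *clean_batch)
    return [torch.dot(gradient_vector(model, loss_fn, x, y), g_clean).item()
            for x, y in candidates]

# Toy usage: a tiny linear classifier, one clean batch, two candidates.
model = torch.nn.Linear(4, 2)
loss_fn = torch.nn.CrossEntropyLoss()
clean = (torch.randn(8, 4), torch.randint(0, 2, (8,)))
cands = [(torch.randn(1, 4), torch.randint(0, 2, (1,))) for _ in range(2)]
print(influence_scores(model, loss_fn, clean, cands))  # keep top scorers
```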

* Accepted by COLING 2022 

A Dual-Attention Neural Network for Pun Location and Using Pun-Gloss Pairs for Interpretation

Oct 14, 2021
Shen Liu, Meirong Ma, Hao Yuan, Jianchao Zhu, Yuanbin Wu, Man Lan

Pun location aims to identify the punning word (usually a word or phrase that makes the text ambiguous) in a given short text, and pun interpretation aims to find two different meanings of the punning word. Most previous studies address pun location using either limited word senses obtained by WSD (Word Sense Disambiguation) techniques or pronunciation information in isolation; for pun interpretation, related work focuses on various WSD algorithms. In this paper, we propose a model called DANN (Dual-Attentive Neural Network) for pun location, which effectively integrates word senses and pronunciation with contextual information to address both kinds of puns simultaneously. Furthermore, we treat pun interpretation as a classification task and construct pun-gloss pairs as training data to solve it. Experiments on the two benchmark datasets show that our proposed methods achieve new state-of-the-art results. Our source code is available in the public code repository.
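
To illustrate the pun-gloss pair construction (a sketch with hypothetical sense keys; real glosses would come from a sense inventory such as WordNet), each candidate gloss of the punning word is paired with the pun context to form a classification instance.

```python
def build_pun_gloss_pairs(context: str, pun_word: str, glosses: dict) -> list:
    """Turn pun interpretation into pair classification: the pun sentence is
    paired with each candidate gloss of the punning word, and a trained pair
    scorer then ranks the glosses; the top two senses form the interpretation."""
    return [((context, f"{pun_word}: {gloss}"), sense)
            for sense, gloss in glosses.items()]

# Toy usage with hand-written glosses and hypothetical sense keys.
glosses = {
    "interest%finance": "a fixed charge for borrowing money",
    "interest%curiosity": "a feeling of wanting to know or learn about something",
}
pairs = build_pun_gloss_pairs(
    "I used to be a banker but I lost interest.", "interest", glosses)
for (context, gloss_text), sense in pairs:
    print(sense, "->", gloss_text)
```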

* NLPCC 2021  

From Alignment to Assignment: Frustratingly Simple Unsupervised Entity Alignment

Sep 15, 2021
Xin Mao, Wenting Wang, Yuanbin Wu, Man Lan

Cross-lingual entity alignment (EA) aims to find the equivalent entities between cross-lingual KGs, a crucial step for integrating KGs. Recently, many GNN-based EA methods have been proposed, showing decent performance improvements on several public datasets. However, existing GNN-based EA methods inevitably inherit poor interpretability and low efficiency from neural networks. Motivated by the isomorphism assumption of GNN-based methods, we transform the cross-lingual EA problem into the assignment problem. Based on this finding, we propose a frustratingly Simple but Effective Unsupervised entity alignment method (SEU) without neural networks. Extensive experiments show that our unsupervised method even beats advanced supervised methods across all public datasets while offering high efficiency, interpretability, and stability.
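
A minimal sketch of the alignment-to-assignment reduction using SciPy's Hungarian solver; the features here are random placeholders (SEU builds its features without neural networks and uses approximate solvers for large graphs).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_by_assignment(feat1: np.ndarray, feat2: np.ndarray) -> list:
    """Solve EA as an assignment problem: build a similarity matrix from
    entity features and find the one-to-one matching that maximizes total
    similarity with the Hungarian algorithm (no training involved)."""
    sim = feat1 @ feat2.T                     # cosine similarity (rows unit-norm)
    rows, cols = linear_sum_assignment(-sim)  # solver minimizes, so negate
    return list(zip(rows.tolist(), cols.tolist()))

# Toy usage: KG2's features are a shuffled copy of KG1's.
rng = np.random.default_rng(0)
f1 = rng.standard_normal((3, 4))
f1 /= np.linalg.norm(f1, axis=1, keepdims=True)
f2 = f1[[2, 0, 1]]                 # entity i in KG2 equals entity [2,0,1][i] in KG1
print(align_by_assignment(f1, f2))  # recovers [(0, 1), (1, 2), (2, 0)]
```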

* 11 pages; Accepted by EMNLP 2021 (Main Conference)

Are Negative Samples Necessary in Entity Alignment? An Approach with High Performance, Scalability and Robustness

Aug 12, 2021
Xin Mao, Wenting Wang, Yuanbin Wu, Man Lan

Entity alignment (EA) aims to find the equivalent entities in different KGs, which is a crucial step in integrating multiple KGs. However, most existing EA methods have poor scalability and are unable to cope with large-scale datasets. We summarize three issues leading to such high time-space complexity in existing EA methods: (1) Inefficient graph encoders, (2) Dilemma of negative sampling, and (3) "Catastrophic forgetting" in semi-supervised learning. To address these challenges, we propose a novel EA method with three new components to enable high Performance, high Scalability, and high Robustness (PSR): (1) Simplified graph encoder with relational graph sampling, (2) Symmetric negative-free alignment loss, and (3) Incremental semi-supervised learning. Furthermore, we conduct detailed experiments on several public datasets to examine the effectiveness and efficiency of our proposed method. The experimental results show that PSR not only surpasses the previous SOTA in performance but also has impressive scalability and robustness.
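
As a loose illustration of component (2), here is a deliberately simplified stand-in for a symmetric, negative-free alignment loss in PyTorch: normalized aligned pairs are simply pulled together, with no negative sampling. The paper's actual loss is more elaborate; this only conveys the negative-free idea.

```python
import torch
import torch.nn.functional as F

def negative_free_alignment_loss(e1: torch.Tensor, e2: torch.Tensor) -> torch.Tensor:
    """Minimal stand-in for a symmetric, negative-free alignment loss:
    normalize both sides and pull each pre-aligned pair together. No
    negative pairs are sampled, removing the usual sampling bottleneck."""
    z1, z2 = F.normalize(e1, dim=1), F.normalize(e2, dim=1)
    return ((z1 - z2) ** 2).sum(dim=1).mean()

# Toy usage: 5 aligned pairs of 8-dimensional entity embeddings.
e1, e2 = torch.randn(5, 8), torch.randn(5, 8)
print(negative_free_alignment_loss(e1, e2))
```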

* 11 pages; Accepted by CIKM 2021 (Full) 

Boosting the Speed of Entity Alignment 10×: Dual Attention Matching Network with Normalized Hard Sample Mining

Mar 29, 2021
Xin Mao, Wenting Wang, Yuanbin Wu, Man Lan

Seeking the equivalent entities among multi-source Knowledge Graphs (KGs) is the pivotal step of KG integration, also known as entity alignment (EA). However, most existing EA methods are inefficient and scale poorly. A recent survey points out that some of them even require several days to process a dataset containing 200,000 nodes (DWY100K). We believe over-complex graph encoders and inefficient negative sampling strategies are the two main reasons. In this paper, we propose a novel KG encoder -- Dual Attention Matching Network (Dual-AMN) -- which not only models both intra-graph and cross-graph information effectively but also greatly reduces computational complexity. Furthermore, we propose a Normalized Hard Sample Mining Loss to smoothly select hard negative samples with reduced loss shift. Experimental results on widely used public datasets indicate that our method achieves both high accuracy and high efficiency. On DWY100K, the whole running process of our method finishes in 1,100 seconds, at least 10× faster than previous work. Our method also outperforms previous works across all datasets, improving Hits@1 and MRR by 6% to 13%.
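
The LogSumExp trick behind smooth hard-negative mining can be sketched as follows (the scale and margin values are illustrative, and the exact normalization in the paper differs): exponential weighting makes the hardest negatives dominate the loss without any explicit negative sampling.

```python
import torch

def hard_sample_mining_loss(sim: torch.Tensor, pos_idx: torch.Tensor,
                            scale: float = 20.0, margin: float = 0.1) -> torch.Tensor:
    """Smooth hard-negative mining sketch: instead of sampling negatives,
    take a LogSumExp over ALL negatives per anchor. The exponential
    re-weighting focuses the loss on the hardest (highest-similarity)
    negatives, while scale/margin keep the loss range stable."""
    n = sim.size(0)
    pos = sim[torch.arange(n), pos_idx]            # positive similarities
    neg = sim.clone()
    neg[torch.arange(n), pos_idx] = float("-inf")  # mask out the positives
    # Per anchor: log(1 + sum_j exp(scale * (margin + neg_j - pos))).
    logits = scale * (margin + neg - pos.unsqueeze(1))
    return torch.logsumexp(
        torch.cat([torch.zeros(n, 1), logits], dim=1), dim=1).mean()

# Toy usage: 4 anchors, 4 candidates; the true match sits on the diagonal.
sim = torch.randn(4, 4)
print(hard_sample_mining_loss(sim, torch.arange(4)))
```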

* 12 pages; Accepted by The Web Conference (WWW) 2021 