Wei Hu

State Key Laboratory for Novel Software Technology, Nanjing University

Improving Continual Relation Extraction by Distinguishing Analogous Semantics

May 11, 2023

Wenzheng Zhao, Yuanning Cui, Wei Hu

Abstract: Continual relation extraction (RE) aims to learn constantly emerging relations while avoiding forgetting the learned relations. Existing works store a small number of typical samples to re-train the model for alleviating forgetting. However, repeatedly replaying these samples may cause the overfitting problem. We conduct an empirical study on existing works and observe that their performance is severely affected by analogous relations. To address this issue, we propose a novel continual extraction model for analogous relations. Specifically, we design memory-insensitive relation prototypes and memory augmentation to overcome the overfitting problem. We also introduce integrated training and focal knowledge distillation to enhance the performance on analogous relations. Experimental results show the superiority of our model and demonstrate its effectiveness in distinguishing analogous relations and overcoming overfitting.

* Accepted in the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)

MixPro: Data Augmentation with MaskMix and Progressive Attention Labeling for Vision Transformer

Apr 24, 2023

Qihao Zhao, Yangyu Huang, Wei Hu, Fan Zhang, Jun Liu

Abstract: The recently proposed data augmentation TransMix employs attention labels to help visual transformers (ViT) achieve better robustness and performance. However, TransMix is deficient in two aspects: 1) The image cropping method of TransMix may not be suitable for vision transformers. 2) At the early stage of training, the model produces unreliable attention maps. TransMix uses unreliable attention maps to compute mixed attention labels that can affect the model. To address the aforementioned issues, we propose MaskMix and Progressive Attention Labeling (PAL) in image and label space, respectively. In detail, from the perspective of image space, we design MaskMix, which mixes two images based on a patch-like grid mask. In particular, the size of each mask patch is adjustable and is a multiple of the image patch size, which ensures each image patch comes from only one image and contains more global contents. From the perspective of label space, we design PAL, which utilizes a progressive factor to dynamically re-weight the attention weights of the mixed attention label. Finally, we combine MaskMix and Progressive Attention Labeling as our new data augmentation method, named MixPro. The experimental results show that our method can improve ViT-based models at different scales on ImageNet classification (73.8% top-1 accuracy based on DeiT-T for 300 epochs). After being pre-trained with MixPro on ImageNet, the ViT-based models also demonstrate better transferability to semantic segmentation, object detection, and instance segmentation. Furthermore, compared to TransMix, MixPro also shows stronger robustness on several benchmarks. The code will be released at https://github.com/fistyee/MixPro.

* ICLR 2023, 16 pages, 6 figures. arXiv admin note: text overlap with arXiv:2111.09833 by other authors
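
The grid-mask idea in the MixPro abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `maskmix`, the per-cell Bernoulli sampling, and the area-based weight `lam` are all assumptions made for illustration. The only property taken from the abstract is that each mask cell is a multiple of the image patch size, so every image patch comes from exactly one source image.

```python
import numpy as np

def maskmix(img_a, img_b, image_patch=16, scale=2, mix_ratio=0.5, rng=None):
    """Mix two HxWxC images with a grid mask (hypothetical sketch).

    Each mask cell covers `scale * image_patch` pixels per side, so no
    image patch of size `image_patch` straddles two source images.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img_a.shape[:2]
    cell = image_patch * scale
    if h % cell or w % cell:
        raise ValueError("image size must be divisible by the mask cell size")
    # Assumption: one independent Bernoulli draw per cell (True -> use img_b).
    grid = rng.random((h // cell, w // cell)) < mix_ratio
    # Upsample the cell grid to pixel resolution.
    mask = np.repeat(np.repeat(grid, cell, axis=0), cell, axis=1)
    mixed = np.where(mask[..., None], img_b, img_a)
    # Assumption: area fraction taken from img_a, usable as a label weight.
    lam = 1.0 - mask.mean()
    return mixed, mask, lam
```

With `scale=1` the mask cell equals the image patch; larger values of `scale` keep bigger contiguous regions from each source image. The progressive label re-weighting (PAL) is not sketched here because the abstract does not give its formula.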

Deep Active Alignment of Knowledge Graph Entities and Schemata

Apr 19, 2023

Jiacheng Huang, Zequn Sun, Qijin Chen, Xiaozhou Xu, Weijun Ren, Wei Hu

Abstract: Knowledge graphs (KGs) store rich facts about the real world. In this paper, we study KG alignment, which aims to find alignment between not only entities but also relations and classes in different KGs. Alignment at the entity level can cross-fertilize alignment at the schema level. We propose a new KG alignment approach, called DAAKG, based on deep learning and active learning. With deep learning, it learns the embeddings of entities, relations and classes, and jointly aligns them in a semi-supervised manner. With active learning, it estimates how likely an entity, relation or class pair can be inferred, and selects the best batch for human labeling. We design two approximation algorithms for efficient solution to batch selection. Our experiments on benchmark datasets show the superior accuracy and generalization of DAAKG and validate the effectiveness of all its modules.

* Accepted in the ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD 2023)

Density-Insensitive Unsupervised Domain Adaption on 3D Object Detection

Apr 19, 2023

Qianjiang Hu, Daizong Liu, Wei Hu

Abstract: 3D object detection from point clouds is crucial in safety-critical autonomous driving. Although many works have made great efforts and achieved significant progress on this task, most of them suffer from expensive annotation cost and poor transferability to unknown data due to the domain gap. Recently, a few works have attempted to tackle the domain gap in objects, but they still fail to adapt to the gap of varying beam densities between two domains, which is critical to mitigate the characteristic differences of the LiDAR collectors. To this end, we propose a density-insensitive domain adaption framework to address the density-induced domain gap. In particular, we first introduce Random Beam Re-Sampling (RBRS) to enhance the robustness of 3D detectors trained on the source domain to the varying beam density. Then, we take this pre-trained detector as the backbone model, and feed the unlabeled target domain data into our newly designed task-specific teacher-student framework for predicting its high-quality pseudo labels. To further adapt the property of density-insensitivity into the target domain, we feed the teacher and student branches with the same sample of different densities, and propose an Object Graph Alignment (OGA) module to construct two object-graphs between the two branches for enforcing the consistency in both the attribute and relation of cross-density objects. Experimental results on three widely adopted 3D object detection datasets demonstrate that our proposed domain adaption method outperforms the state-of-the-art methods, especially over varying-density data. Code is available at https://github.com/WoodwindHu/DTS.

* Accepted by CVPR 2023
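
To make the beam-density idea in the abstract concrete, here is a rough NumPy sketch of randomly re-sampling LiDAR beams. The abstract does not say how RBRS selects beams, so the uniform random subset used here is an assumption, and the names `random_beam_resample`, `beam_ids`, and `keep_ratio` are hypothetical. It also assumes each point already carries the index of the beam that produced it.

```python
import numpy as np

def random_beam_resample(points, beam_ids, keep_ratio, rng=None):
    """Keep only the points from a random subset of LiDAR beams.

    points:     (N, D) array of point features (e.g. x, y, z, intensity).
    beam_ids:   (N,) integer beam index per point (assumed to be available).
    keep_ratio: fraction of distinct beams to keep, in (0, 1].
    """
    rng = np.random.default_rng() if rng is None else rng
    beams = np.unique(beam_ids)
    n_keep = max(1, int(round(keep_ratio * len(beams))))
    # Assumption: beams are chosen uniformly at random without replacement.
    kept = rng.choice(beams, size=n_keep, replace=False)
    mask = np.isin(beam_ids, kept)
    return points[mask], beam_ids[mask]
```

Calling this with a different `keep_ratio` per training sample would expose a detector to point clouds of varying beam density, which is the effect the abstract attributes to RBRS.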

Heterogeneous Federated Knowledge Graph Embedding Learning and Unlearning

Feb 26, 2023

Xiangrong Zhu, Guangyao Li, Wei Hu

Abstract: Federated Learning (FL) has recently emerged as a paradigm to train a global machine learning model across distributed clients without sharing raw data. Knowledge Graph (KG) embedding represents KGs in a continuous vector space, serving as the backbone of many knowledge-driven applications. As a promising combination, federated KG embedding can fully take advantage of knowledge learned from different clients while preserving the privacy of local data. However, realistic problems such as data heterogeneity and knowledge forgetting remain open concerns. In this paper, we propose FedLU, a novel FL framework for heterogeneous KG embedding learning and unlearning. To cope with the drift between local optimization and global convergence caused by data heterogeneity, we propose mutual knowledge distillation to transfer local knowledge to global, and absorb global knowledge back. Moreover, we present an unlearning method based on cognitive neuroscience, which combines retroactive interference and passive decay to erase specific knowledge from local clients and propagate to the global model by reusing knowledge distillation. We construct new datasets for assessing realistic performance of the state-of-the-art methods. Extensive experiments show that FedLU achieves superior results in both link prediction and knowledge forgetting.

* Accepted in the ACM Web Conference (WWW 2023)
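
The mutual knowledge distillation mentioned in the FedLU abstract can be pictured with a generic two-way distillation loss. The sketch below is a common textbook formulation (temperature-scaled softmax plus KL divergence in both directions), not FedLU's actual objective, which the abstract does not specify. The score arrays, the temperature, and all function names are hypothetical.

```python
import numpy as np

def softmax(scores, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = scores / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over the batch; eps guards against log(0)."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1).mean()

def mutual_distillation_losses(local_scores, global_scores, temperature=2.0):
    """Return (loss for the local model, loss for the global model).

    Scores are (batch, num_candidates) arrays, e.g. link-prediction scores
    over candidate entities (an assumption made for this sketch).
    """
    p_local = softmax(local_scores, temperature)
    p_global = softmax(global_scores, temperature)
    loss_local = kl_divergence(p_global, p_local)   # local absorbs global knowledge
    loss_global = kl_divergence(p_local, p_global)  # global absorbs local knowledge
    return loss_local, loss_global
```

In a real training loop each loss would be added to the corresponding model's own embedding loss, treating the other model's distribution as a fixed target.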

LS-DYNA Machine Learning-based Multiscale Method for Nonlinear Modeling of Short Fiber-Reinforced Composites

Jan 06, 2023

Haoyan Wei, C. T. Wu, Wei Hu, Tung-Huan Su, Hitoshi Oura, Masato Nishi, Tadashi Naito, Stan Chung, Leo Shen

Abstract: Short-fiber-reinforced composites (SFRC) are high-performance engineering materials for lightweight structural applications in the automotive and electronics industries. Typically, SFRC structures are manufactured by injection molding, which induces heterogeneous microstructures, and the resulting nonlinear anisotropic behaviors are challenging to predict by conventional micromechanical analyses. In this work, we present a machine learning-based multiscale method by integrating injection molding-induced microstructures, material homogenization, and Deep Material Network (DMN) in the finite element simulation software LS-DYNA for structural analysis of SFRC. DMN is a physics-embedded machine learning model that learns the microscale material morphologies hidden in representative volume elements of composites through offline training. By coupling DMN with finite elements, we have developed a highly accurate and efficient data-driven approach, which predicts nonlinear behaviors of composite materials and structures at a computational speed orders-of-magnitude faster than the high-fidelity direct numerical simulation. To model industrial-scale SFRC products, transfer learning is utilized to generate a unified DMN database, which effectively captures the effects of injection molding-induced fiber orientations and volume fractions on the overall composite properties. Numerical examples are presented to demonstrate the promising performance of this LS-DYNA machine learning-based multiscale method for SFRC modeling.

* Journal of Engineering Mechanics, 2023, 149(3): 04023003
* Final version of this manuscript is published in Journal of Engineering Mechanics. Wei, H., Wu, C. T., Hu, W., Su, T. H., Oura H., Nishi, M., Naito T., Chung S., Shen L. (2023). LS-DYNA machine learning-based multiscale method for nonlinear modeling of short-fiber-reinforced composites. Journal of Engineering Mechanics. 149(3): 04023003. https://doi.org/10.1061/JENMDT.EMENG-6945

Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey

Dec 09, 2022

Yuxin Wang, Jieru Lin, Zhiwei Yu, Wei Hu, Börje F. Karlsson

Abstract: Storytelling and narrative are fundamental to human experience, intertwined with our social and cultural engagement. As such, researchers have long attempted to create systems that can generate stories automatically. In recent years, powered by deep learning and massive data resources, automatic story generation has shown significant advances. However, considerable challenges, like the need for global coherence in generated stories, still hamper generative models from reaching the same storytelling ability as human narrators. To tackle these challenges, many studies seek to inject structured knowledge into the generation process, which is referred to as structured knowledge-enhanced story generation. Incorporating external knowledge can enhance the logical coherence among story events, achieve better knowledge grounding, and alleviate over-generalization and repetition problems in stories. This survey provides an up-to-date and comprehensive review of this research field: (i) we present a systematic taxonomy regarding how existing methods integrate structured knowledge into story generation; (ii) we summarize involved story corpora, structured knowledge datasets, and evaluation metrics; (iii) we give multidimensional insights into the challenges of knowledge-enhanced story generation and cast light on promising directions for future study.

Lifelong Embedding Learning and Transfer for Growing Knowledge Graphs

Nov 29, 2022

Yuanning Cui, Yuxin Wang, Zequn Sun, Wenqiang Liu, Yiqiao Jiang, Kexin Han, Wei Hu

Abstract: Existing knowledge graph (KG) embedding models have primarily focused on static KGs. However, real-world KGs do not remain static, but rather evolve and grow in tandem with the development of KG applications. Consequently, new facts and previously unseen entities and relations continually emerge, necessitating an embedding model that can quickly learn and transfer new knowledge through growth. Motivated by this, we delve into an expanding field of KG embedding in this paper, i.e., lifelong KG embedding. We consider knowledge transfer and retention of the learning on growing snapshots of a KG without having to learn embeddings from scratch. The proposed model includes a masked KG autoencoder for embedding learning and update, with an embedding transfer strategy to inject the learned knowledge into the new entity and relation embeddings, and an embedding regularization method to avoid catastrophic forgetting. To investigate the impacts of different aspects of KG growth, we construct four datasets to evaluate the performance of lifelong KG embedding. Experimental results show that the proposed model outperforms the state-of-the-art inductive and lifelong embedding baselines.

* Accepted in the 37th AAAI Conference on Artificial Intelligence (AAAI 2023)

Learning Latent Part-Whole Hierarchies for Point Clouds

Nov 14, 2022

Xiang Gao, Wei Hu, Renjie Liao

Abstract: Strong evidence suggests that humans perceive the 3D world by parsing visual scenes and objects into part-whole hierarchies. Although deep neural networks have the capability of learning powerful multi-level representations, they cannot explicitly model part-whole hierarchies, which limits their expressiveness and interpretability in processing 3D vision data such as point clouds. To this end, we propose an encoder-decoder style latent variable model that explicitly learns the part-whole hierarchies for multi-level point cloud segmentation. Specifically, the encoder takes a point cloud as input and predicts the per-point latent subpart distribution at the middle level. The decoder takes the latent variable and the feature from the encoder as input and predicts the per-point part distribution at the top level. During training, only annotated part labels at the top level are provided, thus making the whole framework weakly supervised. We explore two kinds of approximated inference algorithms, i.e., most-probable-latent and Monte Carlo methods, and three stochastic gradient estimations for learning discrete latent variables, i.e., straight-through, REINFORCE, and pathwise estimators. Experimental results on the PartNet dataset show that the proposed method achieves state-of-the-art performance in not only top-level part segmentation but also middle-level latent subpart segmentation.

EventEA: Benchmarking Entity Alignment for Event-centric Knowledge Graphs

Nov 05, 2022

Xiaobin Tian, Zequn Sun, Guangyao Li, Wei Hu

Abstract: Entity alignment aims to find identical entities in different knowledge graphs (KGs) that refer to the same real-world object. Embedding-based entity alignment techniques have been drawing a lot of attention recently because they can help solve the issue of symbolic heterogeneity in different KGs. However, in this paper, we show that the progress made in the past was due to biased and unchallenging evaluation. We highlight two major flaws in existing datasets that favor embedding-based entity alignment techniques, i.e., the isomorphic graph structures in relation triples and the weak heterogeneity in attribute triples. Towards a critical evaluation of embedding-based entity alignment methods, we construct a new dataset with heterogeneous relations and attributes based on event-centric KGs. We conduct extensive experiments to evaluate existing popular methods, and find that they fail to achieve promising performance. As a new approach to this difficult problem, we propose a time-aware literal encoder for entity alignment. The dataset and source code are publicly available to foster future research. Our work calls for more effective and practical embedding-based solutions to entity alignment.

* Submitted to ISWC 2022
