Abstract: Artificial Intelligence (AI) systems are increasingly intertwined with daily life, assisting users in executing various tasks and providing guidance on decision-making. This integration introduces risks of AI-driven manipulation, where such systems may exploit users' cognitive biases and emotional vulnerabilities to steer them toward harmful outcomes. Through a randomized controlled trial with 233 participants, we examined human susceptibility to such manipulation in financial (e.g., purchases) and emotional (e.g., conflict resolution) decision-making contexts. Participants interacted with one of three AI agents: a neutral agent (NA) optimizing for user benefit without explicit influence, a manipulative agent (MA) designed to covertly influence beliefs and behaviors, or a strategy-enhanced manipulative agent (SEMA) employing explicit psychological tactics to reach its hidden objectives. By analyzing participants' decision patterns and shifts in their preference ratings post-interaction, we found significant susceptibility to AI-driven manipulation. In particular, across both decision-making domains, participants interacting with the manipulative agents shifted toward harmful options at substantially higher rates (financial, MA: 62.3%, SEMA: 59.6%; emotional, MA: 42.3%, SEMA: 41.5%) than the NA group (financial: 35.8%; emotional: 12.8%). Notably, our findings reveal that even subtle manipulative objectives (MA) can be as effective as explicit psychological strategies (SEMA) in swaying human decision-making. By revealing the potential for covert AI influence, this study highlights a critical vulnerability in human-AI interactions and underscores the need for ethical safeguards and regulatory frameworks to ensure responsible deployment of AI technologies and protect human autonomy.
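
A minimal sketch of the kind of between-group comparison the abstract reports. Only the shift rates come from the text; the per-condition group sizes are hypothetical placeholders (the abstract gives only the 233-participant total), so the printed statistics are illustrative, not the study's results.

```python
# Compare shift-to-harmful rates between each manipulative agent and the
# neutral agent with a two-proportion z-test. Group sizes are ASSUMED
# (233 participants split roughly evenly across three arms).
from statsmodels.stats.proportion import proportions_ztest

n_per_group = 78                                     # hypothetical arm size
rates = {"NA": 0.358, "MA": 0.623, "SEMA": 0.596}    # financial domain, from the abstract

for agent in ("MA", "SEMA"):
    counts = [round(rates[agent] * n_per_group), round(rates["NA"] * n_per_group)]
    stat, p = proportions_ztest(counts, [n_per_group, n_per_group])
    print(f"{agent} vs NA: z={stat:.2f}, p={p:.4f}")
```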




Abstract: 3D car modeling is crucial for applications in autonomous driving systems, virtual and augmented reality, and gaming. However, due to the distinctive properties of cars, such as highly reflective and transparent surface materials, existing methods often struggle to achieve accurate 3D car reconstruction. To address these limitations, we propose Car-GS, a novel approach designed to mitigate the effects of specular highlights and the coupling of RGB and geometry in 3D Gaussian Splatting (3DGS). Our method incorporates three key innovations: First, we introduce view-dependent Gaussian primitives to effectively model surface reflections. Second, we identify the limitations of using a shared opacity parameter for both image rendering and geometric attributes when modeling transparent objects. To overcome this, we assign a learnable geometry-specific opacity to each 2D Gaussian primitive, dedicated solely to rendering depth and normals. Third, we observe that reconstruction errors are most prominent when the camera view is nearly orthogonal to glass surfaces. To address this issue, we develop a quality-aware supervision module that adaptively leverages normal priors from a pre-trained large-scale normal model. Experimental results demonstrate that Car-GS achieves precise reconstruction of car surfaces and significantly outperforms prior methods. The project page is available at https://lcc815.github.io/Car-GS.
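
A toy sketch (not the authors' code) of two of the ideas above: each primitive carries a geometry-specific opacity used only for depth/normal rendering, separate from its RGB opacity; and normal-prior supervision is weighted up when the viewing ray is nearly orthogonal to the surface, where glass errors dominate. The particular weighting form is an assumption for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianPrimitives(nn.Module):
    """n primitives with decoupled opacities (illustrative, not the real model)."""
    def __init__(self, n: int):
        super().__init__()
        self.render_opacity = nn.Parameter(torch.zeros(n))  # used for RGB rendering
        self.geom_opacity = nn.Parameter(torch.zeros(n))    # used only for depth/normals

    def opacities(self):
        return torch.sigmoid(self.render_opacity), torch.sigmoid(self.geom_opacity)

def quality_aware_normal_loss(pred_normals, prior_normals, view_dirs):
    # |cos| is near 1 when the ray is orthogonal to the surface (parallel to
    # the normal); we upweight the prior there (assumed schedule).
    cos = (F.normalize(view_dirs, dim=-1) * F.normalize(prior_normals, dim=-1)).sum(-1).abs()
    return (cos * (1.0 - (pred_normals * prior_normals).sum(-1))).mean()
```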


Abstract: Addressing the unavoidable bias inherent in supervised aging clocks, we introduce Sundial, a novel framework that models molecular dynamics through a diffusion field, capturing both the population-level aging process and the individual-level relative aging order. Sundial enables unbiased estimation of biological age and forecasting of an individual's aging roadmap. Faster-aging individuals identified by Sundial exhibit a higher disease risk than those identified by supervised aging clocks. This framework opens new avenues for exploring key topics, including age- and sex-specific aging dynamics and faster yet healthy aging paths.
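
A hypothetical sketch of what forecasting with a learned diffusion field could look like: given drift and diffusion functions over a molecular state at age t, roll the state forward with Euler-Maruyama steps to trace an aging roadmap. Function names and dynamics are illustrative only; the abstract does not specify Sundial's formulation.

```python
import numpy as np

def forecast(x0, t0, t1, drift, diffusion, dt=0.1, rng=None):
    """Euler-Maruyama rollout of dx = drift(x,t) dt + diffusion(x,t) dW."""
    if rng is None:
        rng = np.random.default_rng(0)
    x, t, path = np.asarray(x0, dtype=float).copy(), t0, []
    path.append(x.copy())
    while t < t1:
        noise = rng.standard_normal(x.shape)
        x = x + drift(x, t) * dt + diffusion(x, t) * np.sqrt(dt) * noise
        t += dt
        path.append(x.copy())
    return np.stack(path)

# Toy usage with placeholder dynamics (mean-reverting drift, constant noise):
roadmap = forecast(x0=np.zeros(8), t0=40.0, t1=50.0,
                   drift=lambda x, t: -0.05 * x,
                   diffusion=lambda x, t: 0.1)
```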
Abstract: The multi-modality pre-training paradigm that aligns protein sequences with biological descriptions has learned general protein representations and achieved promising performance in various downstream applications. However, these works have been unable to replicate the extraordinary success of language-supervised visual foundation models, due to ineffective usage of aligned protein-text paired data and the lack of an effective function-informed pre-training paradigm. To address these issues, this paper curates a large-scale protein-text paired dataset called ProtAnno with a property-driven sampling strategy, and introduces a novel function-informed protein pre-training paradigm. Specifically, the sampling strategy determines selection probability based on sample confidence and property coverage, balancing data quality against data quantity in the face of large-scale noisy data. Furthermore, motivated by the significance of protein-specific functional mechanisms, the proposed paradigm explicitly models static and dynamic protein functional segments with two segment-wise pre-training objectives, injecting fine-grained information in a function-informed manner. Leveraging these innovations, we develop ProtCLIP, a multi-modality foundation model that comprehensively represents function-aware protein embeddings. On 22 protein benchmarks spanning 5 types, including protein functionality classification, mutation effect prediction, cross-modal transformation, semantic similarity inference, and protein-protein interaction prediction, ProtCLIP consistently achieves SOTA performance, with remarkable improvements of 75% on average across five cross-modal transformation benchmarks, 59.9% on GO-CC, and 39.7% on GO-BP protein function prediction. These results verify the extraordinary potential of ProtCLIP as a protein multi-modality foundation model.
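
A sketch, under stated assumptions, of a property-driven sampling strategy in the spirit described above: each protein-text pair gets a selection probability that rises with annotation confidence and falls for properties that are already well covered, trading data quality against quantity. The exact weighting used for ProtAnno is not given in the abstract; `alpha` and `beta` are illustrative knobs.

```python
import numpy as np

def selection_probs(confidence, property_ids, alpha=1.0, beta=1.0):
    """Selection probability ~ confidence^alpha * (1 - property coverage)^beta."""
    confidence = np.asarray(confidence, dtype=float)     # annotation confidence in [0, 1]
    property_ids = np.asarray(property_ids)
    _, inverse, counts = np.unique(property_ids, return_inverse=True, return_counts=True)
    coverage = counts[inverse] / len(property_ids)       # fraction of data per property
    scores = (confidence ** alpha) * ((1.0 - coverage) ** beta)
    return scores / scores.sum()

# Rare "transport" annotations get boosted; low-confidence pairs get suppressed.
probs = selection_probs(confidence=[0.9, 0.4, 0.8, 0.95],
                        property_ids=["kinase", "kinase", "transport", "kinase"])
```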




Abstract: Autonomous driving requires robust perception models trained on high-quality, large-scale multi-view driving videos for tasks like 3D object detection, segmentation, and trajectory prediction. While world models provide a cost-effective solution for generating realistic driving videos, challenges remain in ensuring these videos adhere to fundamental physical principles, such as relative and absolute motion, spatial relationships like occlusion, and spatial and temporal consistency. To address these, we propose DrivePhysica, an innovative model designed to generate realistic multi-view driving videos that accurately adhere to essential physical principles through three key advancements: (1) a Coordinate System Aligner module that integrates relative and absolute motion features to enhance motion interpretation, (2) an Instance Flow Guidance module that ensures precise temporal consistency via efficient 3D flow extraction, and (3) a Box Coordinate Guidance module that improves spatial relationship understanding and accurately resolves occlusion hierarchies. Grounded in physical principles, we achieve state-of-the-art performance in driving video generation quality (3.96 FID and 38.06 FVD on the nuScenes dataset) and downstream perception tasks. Our project homepage: https://metadrivescape.github.io/papers_project/DrivePhysica/page.html
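
A hedged sketch of the coordinate-alignment idea: express each agent's motion both in the world frame (absolute) and in the ego-vehicle frame (relative), then fuse the two as conditioning features. The actual module architecture is not given in the abstract; this only shows the frame conversion.

```python
import numpy as np

def align_motion_features(agent_vel_world, ego_pose):
    """agent_vel_world: (N, 3) world-frame velocities.
    ego_pose: 4x4 world-from-ego transform; its 3x3 block rotates ego -> world.
    Returns (N, 6): absolute and ego-relative motion, concatenated."""
    R_world_from_ego = ego_pose[:3, :3]
    # Rotating each row by R^T takes world-frame vectors into the ego frame.
    vel_ego = agent_vel_world @ R_world_from_ego
    return np.concatenate([agent_vel_world, vel_ego], axis=-1)
```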




Abstract: Generating detailed captions that comprehend the text-rich visual content of images has received growing attention for Large Vision-Language Models (LVLMs). However, few studies have developed benchmarks specifically tailored to detailed captions for measuring their accuracy and comprehensiveness. In this paper, we introduce a detailed caption benchmark, termed CompreCap, to evaluate visual context from a directed scene graph view. Concretely, we first manually segment the image into semantically meaningful regions (i.e., semantic segmentation masks) according to a common-object vocabulary, while also distinguishing the attributes of objects within all those regions. Directed relation labels between these objects are then annotated to compose a directed scene graph that encodes the rich compositional information of the image. Based on our directed scene graphs, we develop a pipeline to assess the detailed captions generated by LVLMs on multiple levels, including object-level coverage, the accuracy of attribute descriptions, and the score of key relationships. Experimental results on the CompreCap dataset confirm that our evaluation method aligns closely with human evaluation scores across LVLMs.
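
A simplified sketch of scene-graph-based caption scoring in the spirit of the pipeline above: given annotated objects, attributes, and directed relations, measure how much of each the caption covers. The real CompreCap pipeline matches phrases with far more care than these exact substring tests.

```python
def score_caption(caption, objects, attributes, relations):
    """Toy multi-level scorer over a directed scene graph annotation."""
    text = caption.lower()
    obj_coverage = sum(o.lower() in text for o in objects) / len(objects)
    attr_acc = (sum(o.lower() in text and a.lower() in text for o, a in attributes)
                / len(attributes)) if attributes else 1.0
    rel_score = (sum(all(w.lower() in text for w in triple) for triple in relations)
                 / len(relations)) if relations else 1.0
    return {"object_coverage": obj_coverage,
            "attribute_acc": attr_acc,
            "relation_score": rel_score}

print(score_caption("A red car is parked behind a tall tree.",
                    objects=["car", "tree"],
                    attributes=[("car", "red"), ("tree", "tall")],
                    relations=[("car", "behind", "tree")]))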




Abstract: Although text-to-image (T2I) models have recently thrived as visual generative priors, their reliance on high-quality text-image pairs makes scaling up expensive. We argue that grasping the cross-modality alignment is not a necessity for a sound visual generative prior, whose focus should be on texture modeling. Such a philosophy inspires us to study image-to-image (I2I) generation, where models can learn from in-the-wild images in a self-supervised manner. We first develop a pure vision-based training framework, Lumos, and confirm the feasibility and the scalability of learning I2I models. We then find that, as an upstream task of T2I, our I2I model serves as a more foundational visual prior and achieves on-par or better performance than existing T2I models using only 1/10 text-image pairs for fine-tuning. We further demonstrate the superiority of I2I priors over T2I priors on some text-irrelevant visual generative tasks, like image-to-3D and image-to-video.
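
A minimal sketch of the self-supervised I2I idea under an assumed corruption-and-reconstruction objective (the abstract does not specify Lumos's actual training target): degrade an unlabeled image, then train the model to restore it, so a texture prior is learned from images alone with no text pairs.

```python
import torch
import torch.nn.functional as F

def i2i_training_step(model, images, optimizer):
    """One assumed self-supervised step: restore texture lost to downsampling."""
    low = F.interpolate(images, scale_factor=0.25, mode="bilinear")
    corrupted = F.interpolate(low, size=images.shape[-2:], mode="bilinear")
    loss = F.l1_loss(model(corrupted), images)  # reconstruct the clean image
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```
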
Abstract: The ever-increasing sizes of large language models necessitate distributed solutions for fast inference that exploit multi-dimensional parallelism, where computational loads are split across various accelerators such as GPU clusters. However, this approach often introduces significant communication overhead, especially on devices with limited bandwidth. In this paper, we introduce Flash Communication, a novel low-bit compression technique designed to alleviate the tensor-parallelism communication bottleneck during inference. Our method substantially boosts intra-node communication speed by more than 3x and reduces the time-to-first-token by 2x, with nearly no sacrifice in model accuracy. Extensive experiments on various up-to-date LLMs demonstrate the effectiveness of our approach.
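
A sketch of the general low-bit communication pattern this line of work targets, not the authors' exact algorithm: quantize activations to int8 with a per-tensor scale, exchange the compact tensors, then dequantize and sum locally, cutting the bytes on the wire for a tensor-parallel all-reduce. Assumes an already-initialized torch.distributed process group.

```python
import torch
import torch.distributed as dist

def quantized_all_reduce(x: torch.Tensor) -> torch.Tensor:
    """All-reduce emulated with int8 all-gather + local dequantized sum."""
    scale = (x.abs().max().clamp(min=1e-8) / 127.0).reshape(1)
    q = torch.clamp((x / scale).round(), -127, 127).to(torch.int8)

    world = dist.get_world_size()
    q_all = [torch.empty_like(q) for _ in range(world)]
    s_all = [torch.empty_like(scale) for _ in range(world)]
    dist.all_gather(q_all, q)       # int8 payload: ~4x fewer bytes than fp32
    dist.all_gather(s_all, scale)

    return sum(qi.float() * si for qi, si in zip(q_all, s_all))
```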




Abstract: Autonomous driving systems struggle with complex scenarios due to limited access to the diverse, extensive, and out-of-distribution driving data that are critical for safe navigation. World models offer a promising solution to this challenge; however, current driving world models are constrained by short time windows and limited scenario diversity. To bridge this gap, we introduce InfinityDrive, the first driving world model with exceptional generalization capabilities, delivering state-of-the-art fidelity, consistency, and diversity with minute-scale video generation. InfinityDrive introduces an efficient spatio-temporal co-modeling module paired with an extended temporal training strategy, enabling high-resolution (576×1024) video generation with consistent spatial and temporal coherence. By incorporating memory injection and retention mechanisms alongside an adaptive memory curve loss to minimize cumulative errors, InfinityDrive achieves consistent video generation lasting over 1,500 frames (approximately 2 minutes). Comprehensive experiments on multiple datasets validate InfinityDrive's ability to generate complex and varied scenarios, highlighting its potential as a next-generation driving world model built for the evolving demands of autonomous driving. Our project homepage: https://metadrivescape.github.io/papers_project/InfinityDrive/page.html
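
A hedged sketch of what an "adaptive memory curve" style loss could look like: per-frame reconstruction errors are re-weighted by a learnable decay curve over frame age, so recent frames dominate while long-range memory is still penalized. The actual InfinityDrive formulation is not given in the abstract; this is illustrative only.

```python
import torch
import torch.nn as nn

class MemoryCurveLoss(nn.Module):
    """Illustrative decay-weighted frame loss (assumed form)."""
    def __init__(self):
        super().__init__()
        self.log_decay = nn.Parameter(torch.tensor(0.0))  # learnable decay rate

    def forward(self, pred, target):
        # pred/target: (B, T, C, H, W), frames ordered oldest to newest.
        T = pred.shape[1]
        ages = torch.arange(T - 1, -1, -1, device=pred.device).float()
        weights = torch.exp(-self.log_decay.exp() * ages)      # memory curve
        per_frame = (pred - target).abs().flatten(2).mean(-1)  # (B, T)
        return (weights * per_frame).mean()
```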