Under circumstances of heterophily, where nodes with different labels tend to be connected based on semantic meanings, Graph Neural Networks (GNNs) often exhibit suboptimal performance. Current studies on graph heterophily mainly focus on aggregation calibration or neighbor extension and address the heterophily issue by utilizing node features or structural information to improve GNN representations. In this paper, we propose and demonstrate that the valuable semantic information inherent in heterophily can be utilized effectively in graph learning by investigating the distribution of neighbors for each individual node within the graph. The theoretical analysis is carried out to demonstrate the efficacy of the idea in enhancing graph learning. Based on this analysis, we propose HiGNN, an innovative approach that constructs an additional new graph structure, that integrates heterophilous information by leveraging node distribution to enhance connectivity between nodes that share similar semantic characteristics. We conduct empirical assessments on node classification tasks using both homophilous and heterophilous benchmark datasets and compare HiGNN to popular GNN baselines and SoTA methods, confirming the effectiveness in improving graph representations. In addition, by incorporating heterophilous information, we demonstrate a notable enhancement in existing GNN-based approaches, and the homophily degree across real-world datasets, thus affirming the efficacy of our approach.
This paper proposes the DistillCSE framework, which performs contrastive learning under the self-training paradigm with knowledge distillation. The potential advantage of DistillCSE is its self-enhancing feature: using a base model to provide additional supervision signals, a stronger model may be learned through knowledge distillation. However, the vanilla DistillCSE through the standard implementation of knowledge distillation only achieves marginal improvements due to severe overfitting. The further quantitative analyses demonstrate the reason that the standard knowledge distillation exhibits a relatively large variance of the teacher model's logits due to the essence of contrastive learning. To mitigate the issue induced by high variance, this paper accordingly proposed two simple yet effective solutions for knowledge distillation: a Group-P shuffling strategy as an implicit regularization and the averaging logits from multiple teacher components. Experiments on standard benchmarks demonstrate that the proposed DistillCSE outperforms many strong baseline methods and yields a new state-of-the-art performance.
Back translation (BT) is one of the most significant technologies in NMT research fields. Existing attempts on BT share a common characteristic: they employ either beam search or random sampling to generate synthetic data with a backward model but seldom work studies the role of synthetic data in the performance of BT. This motivates us to ask a fundamental question: {\em what kind of synthetic data contributes to BT performance?} Through both theoretical and empirical studies, we identify two key factors on synthetic data controlling the back-translation NMT performance, which are quality and importance. Furthermore, based on our findings, we propose a simple yet effective method to generate synthetic data to better trade off both factors so as to yield a better performance for BT. We run extensive experiments on WMT14 DE-EN, EN-DE, and RU-EN benchmark tasks. By employing our proposed method to generate synthetic data, our BT model significantly outperforms the standard BT baselines (i.e., beam and sampling based methods for data generation), which proves the effectiveness of our proposed methods.
Cross-domain pedestrian detection aims to generalize pedestrian detectors from one label-rich domain to another label-scarce domain, which is crucial for various real-world applications. Most recent works focus on domain alignment to train domain-adaptive detectors either at the instance level or image level. From a practical point of view, one-stage detectors are faster. Therefore, we concentrate on designing a cross-domain algorithm for rapid one-stage detectors that lacks instance-level proposals and can only perform image-level feature alignment. However, pure image-level feature alignment causes the foreground-background misalignment issue to arise, i.e., the foreground features in the source domain image are falsely aligned with background features in the target domain image. To address this issue, we systematically analyze the importance of foreground and background in image-level cross-domain alignment, and learn that background plays a more critical role in image-level cross-domain alignment. Therefore, we focus on cross-domain background feature alignment while minimizing the influence of foreground features on the cross-domain alignment stage. This paper proposes a novel framework, namely, background-focused distribution alignment (BFDA), to train domain adaptive onestage pedestrian detectors. Specifically, BFDA first decouples the background features from the whole image feature maps and then aligns them via a novel long-short-range discriminator.
This paper aims to improve contrastive learning for sentence embeddings from two perspectives: handling dropout noise and addressing feature corruption. Specifically, for the first perspective, we identify that the dropout noise from negative pairs affects the model's performance. Therefore, we propose a simple yet effective method to deal with such type of noise. Secondly, we pinpoint the rank bottleneck of current solutions to feature corruption and propose a dimension-wise contrastive learning objective to address this issue. Both proposed methods are generic and can be applied to any contrastive learning based models for sentence embeddings. Experimental results on standard benchmarks demonstrate that combining both proposed methods leads to a gain of 1.8 points compared to the strong baseline SimCSE configured with BERT base. Furthermore, applying the proposed method to DiffCSE, another strong contrastive learning based baseline, results in a gain of 1.4 points.
With the rapid development of information nowadays, spectrum resources are becoming more and more scarce, leading to a shift in the research direction from the modulation classification of a single signal to the modulation classification of multiple signals on the same channel. Therefore, the emergence of an effective mixed signals automatic modulation classification technology have important significance. Considering that NOMA technology has deeper requirements for the modulation classification of mixed signals under different power, this paper mainly introduces and uses a variety of deep learning networks to classify such mixed signals. First, the modulation classification of a single signal based on the existing CNN model is reproduced. We then develop new methods to improve the basic CNN structure and apply it to the modulation classification of mixed signals. Meanwhile, the effects of the number of training sets, the type of training sets and the training methods on the recognition accuracy of mixed signals are studied. Second, we investigate some deep learning models based on CNN (ResNet34, hierarchical structure) and other deep learning models (LSTM, CLDNN). It can be seen although the time and space complexity of these algorithms have increased, different deep learning models have different effects on the modulation classification problem of mixed signals at different power. Generally speaking, higher accuracy gains can be achieved.
Knowledge graph embedding (KGE), aiming to embed entities and relations into low-dimensional vectors, has attracted wide attention recently. However, the existing research is mainly based on the black-box neural models, which makes it difficult to interpret the learned representation. In this paper, we introduce DisenE, an end-to-end framework to learn disentangled knowledge graph embeddings. Specially, we introduce an attention-based mechanism that enables the model to explicitly focus on relevant components of entity embeddings according to a given relation. Furthermore, we introduce two novel regularizers to encourage each component of the entity representation to independently reflect an isolated semantic aspect. Experimental results demonstrate that our proposed DisenE investigates a perspective to address the interpretability of KGE and is proved to be an effective way to improve the performance of link prediction tasks.
Hierarchical multi-label text classification(HMTC) problems become popular recently because of its practicality. Most existing algorithms for HMTC focus on the design of classifiers, and are largely referred to as local, global, or a combination of local/global approaches. However, a few studies have started exploring hierarchical feature extraction based on the label hierarchy associating with text in HMTC. In this paper, a \textbf{N}eural network-based method called \textbf{LA-HCN} is proposed where a novel \textbf{L}abel-based \textbf{A}ttention module is designed to hierarchically extract important information from the text based on different labels. Besides, local and global document embeddings are separately generated to support the respective local and global classifications. In our experiments, LA-HCN achieves the top performance on the four public HMTC datasets when compared with other neural network-based state-of-the-art algorithms. The comparison between LA-HCN with its variants also demonstrates the effectiveness of the proposed label-based attention module as well as the use of the combination of local and global classifications. By visualizing the learned attention(words), we find LA-HCN is able to extract meaningful but different information from text based on different labels which is helpful for human understanding and explanation of classification results.