Recently, heterogeneous Graph Neural Networks (GNNs) have become a de facto model for analyzing HGs, while most of them rely on a relative large number of labeled data. In this work, we investigate Contrastive Learning (CL), a key component in self-supervised approaches, on HGs to alleviate the label scarcity problem. We first generate multiple semantic views according to metapaths and network schemas. Then, by pushing node embeddings corresponding to different semantic views close to each other (positives) and pulling other embeddings apart (negatives), one can obtain informative representations without human annotations. However, this CL approach ignores the relative hardness of negative samples, which may lead to suboptimal performance. Considering the complex graph structure and the smoothing nature of GNNs, we propose a structure-aware hard negative mining scheme that measures hardness by structural characteristics for HGs. By synthesizing more negative nodes, we give larger weights to harder negatives with limited computational overhead to further boost the performance. Empirical studies on three real-world datasets show the effectiveness of our proposed method. The proposed method consistently outperforms existing state-of-the-art methods and notably, even surpasses several supervised counterparts.
Multi-view network embedding aims at projecting nodes in the network to low-dimensional vectors, while preserving their multiple relations and attribute information. Contrastive learning-based methods have preliminarily shown promising performance in this task. However, most contrastive learning-based methods mostly rely on high-quality graph embedding and explore less on the relationships between different graph views. To deal with these deficiencies, we design a novel node-to-node Contrastive learning framework for Multi-view network Embedding (CREME), which mainly contains two contrastive objectives: Multi-view fusion InfoMax and Inter-view InfoMin. The former objective distills information from embeddings generated from different graph views, while the latter distinguishes different graph views better to capture the complementary information between them. Specifically, we first apply a view encoder to generate each graph view representation and utilize a multi-view aggregator to fuse these representations. Then, we unify the two contrastive objectives into one learning objective for training. Extensive experiments on three real-world datasets show that CREME outperforms existing methods consistently.
Recently, Deep Neural Networks (DNNs) have made remarkable progress for text classification, which, however, still require a large number of labeled data. To train high-performing models with the minimal annotation cost, active learning is proposed to select and label the most informative samples, yet it is still challenging to measure informativeness of samples used in DNNs. In this paper, inspired by piece-wise linear interpretability of DNNs, we propose a novel Active Learning with DivErse iNterpretations (ALDEN) approach. With local interpretations in DNNs, ALDEN identifies linearly separable regions of samples. Then, it selects samples according to their diversity of local interpretations and queries their labels. To tackle the text classification problem, we choose the word with the most diverse interpretations to represent the whole sentence. Extensive experiments demonstrate that ALDEN consistently outperforms several state-of-the-art deep active learning methods.
Recently, Graph Convolution Network (GCN) based methods have achieved outstanding performance for recommendation. These methods embed users and items in Euclidean space, and perform graph convolution on user-item interaction graphs. However, real-world datasets usually exhibit tree-like hierarchical structures, which make Euclidean space less effective in capturing user-item relationship. In contrast, hyperbolic space, as a continuous analogue of a tree-graph, provides a promising alternative. In this paper, we propose a fully hyperbolic GCN model for recommendation, where all operations are performed in hyperbolic space. Utilizing the advantage of hyperbolic space, our method is able to embed users/items with less distortion and capture user-item interaction relationship more accurately. Extensive experiments on public benchmark datasets show that our method outperforms both Euclidean and hyperbolic counterparts and requires far lower embedding dimensionality to achieve comparable performance.
Graph Neural Networks (GNNs) have achieved great success among various domains. Nevertheless, most GNN methods are sensitive to the quality of graph structures. To tackle this problem, some studies exploit different graph structure learning strategies to refine the original graph structure. However, these methods only consider feature information while ignoring available label information. In this paper, we propose a novel label-informed graph structure learning framework which incorporates label information explicitly through a class transition matrix. We conduct extensive experiments on seven node classification benchmark datasets and the results show that our method outperforms or matches the state-of-the-art baselines.
Factorization machine (FM) is a prevalent approach to modeling pairwise (second-order) feature interactions when dealing with high-dimensional sparse data. However, on the one hand, FM fails to capture higher-order feature interactions suffering from combinatorial expansion, on the other hand, taking into account interaction between every pair of features may introduce noise and degrade prediction accuracy. To solve the problems, we propose a novel approach Graph Factorization Machine (GraphFM) by naturally representing features in the graph structure. In particular, a novel mechanism is designed to select the beneficial feature interactions and formulate them as edges between features. Then our proposed model which integrates the interaction function of FM into the feature aggregation strategy of Graph Neural Network (GNN), can model arbitrary-order feature interactions on the graph-structured features by stacking layers. Experimental results on several real-world datasets has demonstrated the rationality and effectiveness of our proposed approach.
Due to the powerful learning ability on high-rank and non-linear features, deep neural networks (DNNs) are being applied to data mining and machine learning in various fields, and exhibit higher discrimination performance than conventional methods. However, the applications based on DNNs are rare in enterprise credit rating tasks because most of DNNs employ the "end-to-end" learning paradigm, which outputs the high-rank representations of objects and predictive results without any explanations. Thus, users in the financial industry cannot understand how these high-rank representations are generated, what do they mean and what relations exist with the raw inputs. Then users cannot determine whether the predictions provided by DNNs are reliable, and not trust the predictions providing by such "black box" models. Therefore, in this paper, we propose a novel network to explicitly model the enterprise credit rating problem using DNNs and attention mechanisms. The proposed model realizes explainable enterprise credit ratings. Experimental results obtained on real-world enterprise datasets verify that the proposed approach achieves higher performance than conventional methods, and provides insights into individual rating results and the reliability of model training.
Item-based collaborative filtering (ICF) has been widely used in industrial applications such as recommender system and online advertising. It models users' preference on target items by the items they have interacted with. Recent models use methods such as attention mechanism and deep neural network to learn the user representation and scoring function more accurately. However, despite their effectiveness, such models still overlook a problem that performance of ICF methods heavily depends on the quality of item representation especially the target item representation. In fact, due to the long-tail distribution in the recommendation, most item embeddings can not represent the semantics of items accurately and thus degrade the performance of current ICF methods. In this paper, we propose an enhanced representation of the target item which distills relevant information from the co-occurrence items. We design sampling strategies to sample fix number of co-occurrence items for the sake of noise reduction and computational cost. Considering the different importance of sampled items to the target item, we apply attention mechanism to selectively adopt the semantic information of the sampled items. Our proposed Co-occurrence based Enhanced Representation model (CER) learns the scoring function by a deep neural network with the attentive user representation and fusion of raw representation and enhanced representation of target item as input. With the enhanced representation, CER has stronger representation power for the tail items compared to the state-of-the-art ICF methods. Extensive experiments on two public benchmarks demonstrate the effectiveness of CER.
Multimedia content is of predominance in the modern Web era. Investigating how users interact with multimodal items is a continuing concern within the rapid development of recommender systems. The majority of previous work focuses on modeling user-item interactions with multimodal features included as side information. However, this scheme is not well-designed for multimedia recommendation. Specifically, only collaborative item-item relationships are implicitly modeled through high-order item-user-item relations. Considering that items are associated with rich contents in multiple modalities, we argue that the latent item-item structures underlying these multimodal contents could be beneficial for learning better item representations and further boosting recommendation. To this end, we propose a LATent sTructure mining method for multImodal reCommEndation, which we term LATTICE for brevity. To be specific, in the proposed LATTICE model, we devise a novel modality-aware structure learning layer, which learns item-item structures for each modality and aggregates multiple modalities to obtain latent item graphs. Based on the learned latent graphs, we perform graph convolutions to explicitly inject high-order item affinities into item representations. These enriched item representations can then be plugged into existing collaborative filtering methods to make more accurate recommendations. Extensive experiments on three real-world datasets demonstrate the superiority of our method over state-of-the-art multimedia recommendation methods and validate the efficacy of mining latent item-item relationships from multimodal features.
Modeling users' preference from his historical sequences is one of the core problem of sequential recommendation. Existing methods in such fields are widely distributed from conventional methods to deep learning methods. However, most of them only model users' interests within their own sequences and ignore the fine-grained utilization of dynamic collaborative signals among different user sequences, making them insufficient to explore users' preferences. We take inspiration from dynamic graph neural networks to cope with this challenge, unifying the user sequence modeling and dynamic interaction information among users into one framework. We propose a new method named \emph{Dynamic Graph Neural Network for Sequential Recommendation} (DGSR), which connects the sequence of different users through a dynamic graph structure, exploring the interactive behavior of users and items with time and order information. Furthermore, we design a Dynamic Graph Attention Neural Network to achieve the information propagation and aggregation among different users and their sequences in the dynamic graph. Consequently, the next-item prediction task in sequential recommendation is converted into a link prediction task for the user node to the item node in a dynamic graph. Extensive experiments on four public benchmarks show that DGSR outperforms several state-of-the-art methods. Further studies demonstrate the rationality and effectiveness of modeling user sequences through a dynamic graph.