Graph embedding is essential for graph mining tasks. With the prevalence of graph data in real-world applications, many methods have been proposed in recent years to learn high-quality graph embedding vectors for various types of graphs. However, most existing methods randomly select negative samples from the original graph to enhance the training data without considering the noise. In addition, most of these methods focus only on explicit graph structures and cannot fully capture complex edge semantics such as various relationships or asymmetry. To address these issues, and inspired by generative adversarial networks, we propose a robust and generalized framework for adversarial graph embedding, named AGE. AGE generates fake neighbor nodes as enhanced negative samples from the implicit distribution, and enables the discriminator and generator to jointly learn each node's robust and generalized representation. Based on this framework, we propose three models to handle three types of graph data and derive the corresponding optimization algorithms: UG-AGE and DG-AGE for undirected and directed homogeneous graphs, respectively, and HIN-AGE for heterogeneous information networks. Extensive experiments show that our methods consistently and significantly outperform existing state-of-the-art methods across multiple graph mining tasks, including link prediction, node classification, and graph reconstruction.
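For context, the conventional strategy that the abstract above contrasts AGE with, drawing negative samples uniformly at random from the original graph, can be sketched as follows. This is a minimal illustration of that baseline only, not of AGE's generator; the function name `sample_negative_edges` and its parameters are assumptions introduced here, and the graph is assumed to be undirected with integer node ids.

```python
import random


def sample_negative_edges(num_nodes, edges, num_samples, seed=0):
    """Uniformly sample distinct non-edges (u, v), u < v, of an undirected graph.

    Hypothetical helper for illustration: this is the random baseline,
    not the adversarially generated negatives described for AGE.
    """
    rng = random.Random(seed)
    # Store existing edges as unordered pairs, ignoring self-loops.
    existing = {frozenset(e) for e in edges if e[0] != e[1]}
    max_negatives = num_nodes * (num_nodes - 1) // 2 - len(existing)
    if num_samples > max_negatives:
        raise ValueError("not enough non-edges to sample from")

    negatives = set()
    while len(negatives) < num_samples:
        u = rng.randrange(num_nodes)
        v = rng.randrange(num_nodes)
        if u == v:
            continue  # skip self-loops
        pair = frozenset((u, v))
        if pair in existing:
            continue  # skip true edges
        negatives.add(pair)
    return sorted(tuple(sorted(p)) for p in negatives)


if __name__ == "__main__":
    print(sample_negative_edges(4, [(0, 1), (1, 2)], num_samples=3))
```

Because the pairs are drawn without regard to the graph's semantics, some sampled "negatives" may be unobserved but plausible links, which is the kind of noise the abstract refers to.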
Graph representation learning has achieved great success in many areas, including e-commerce, chemistry, and biology. However, the fundamental problem of choosing the appropriate node embedding dimension for a given graph remains unsolved. The commonly used strategies for Node Embedding Dimension Selection (NEDS), based on grid search or empirical knowledge, suffer from heavy computation and poor model performance. In this paper, we revisit NEDS from the perspective of the minimum entropy principle. We then propose a novel Minimum Graph Entropy (MinGE) algorithm for NEDS with graph data. Specifically, MinGE considers both feature entropy and structure entropy on graphs, each carefully designed according to the characteristics of the rich information they contain. The feature entropy, which assumes that the embeddings of adjacent nodes are more similar, connects node features and link topology on graphs. The structure entropy takes the normalized degree as its basic unit to further measure the higher-order structure of graphs. Based on these two quantities, we design MinGE to directly calculate the ideal node embedding dimension for any graph. Finally, comprehensive experiments with popular Graph Neural Networks (GNNs) on benchmark datasets demonstrate the effectiveness and generalizability of the proposed MinGE.
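To make the idea of an entropy built on normalized degree concrete, the sketch below computes the plain Shannon entropy of a graph's normalized degree distribution. This is a generic, first-order quantity used here only as an illustration; it is not MinGE's structure entropy (which the abstract says also measures higher-order structure) and it does not output an embedding dimension. The function name `degree_entropy` is an assumption.

```python
import math


def degree_entropy(edges):
    """Shannon entropy (in bits) of the normalized degree distribution.

    Each node i contributes p_i = d_i / sum_j d_j, where d_i is its degree
    in an undirected graph given as a list of (u, v) pairs.
    Illustrative only: not the structure entropy defined by MinGE.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    total = sum(degree.values())
    if total == 0:
        return 0.0
    return -sum((d / total) * math.log2(d / total) for d in degree.values())


if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]  # every node has degree 2
    star = [(0, 1), (0, 2), (0, 3)]           # one hub, three leaves
    print(degree_entropy(cycle), degree_entropy(star))
```

A regular graph such as the 4-cycle maximizes this entropy for its node count, while a hub-dominated graph such as the star scores lower.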
Recent years have witnessed the fast development of the emerging topic of Graph Learning based Recommender Systems (GLRS). GLRS employ advanced graph learning approaches to model users' preferences and intentions as well as items' characteristics for recommendations. Unlike other RS approaches, including content-based filtering and collaborative filtering, GLRS are built on graphs where the important objects, e.g., users, items, and attributes, are either explicitly or implicitly connected. With the rapid development of graph learning techniques, exploring and exploiting homogeneous or heterogeneous relations in graphs is a promising direction for building more effective RS. In this paper, we provide a systematic review of GLRS by discussing how they extract important knowledge from graph-based representations to improve the accuracy, reliability, and explainability of recommendations. First, we characterize and formalize GLRS, and then summarize and categorize the key challenges and main progress in this novel research area. Finally, we share some new research directions in this vibrant area.
With recent advances in data collection from multiple sources, multi-view data has received significant attention. In multi-view data, each view represents a different perspective of the data. Since label information is often expensive to acquire, multi-view clustering has gained growing interest; it aims to obtain a better clustering solution by exploiting complementary and consistent information across all views rather than using only an individual view. Due to inevitable sensor failures, the data in each view may contain errors. Errors often appear as noise, feature-specific corruptions, or outliers, and multi-view data may contain any one or a combination of these error types. Blindly clustering multi-view data, i.e., without considering possible errors in the view(s), could significantly degrade performance. The goal of error-robust multi-view clustering is to obtain a useful outcome even if the multi-view data is corrupted. Existing error-robust multi-view clustering approaches with an explicit error removal formulation can be structured into five broad research categories: sparsity norm based approaches, graph based methods, subspace based learning approaches, deep learning based methods, and hybrid approaches. This survey summarizes and reviews recent advances in error-robust clustering for multi-view data. Finally, we highlight the challenges and provide future research opportunities.
Sequential Recommendation characterizes evolving patterns by modeling item sequences chronologically. Its essential target is to capture the item transition correlations. Recent developments in transformers have inspired the community to design effective sequence encoders, \textit{e.g.,} SASRec and BERT4Rec. However, we observe that these transformer-based models suffer from the cold-start issue, \textit{i.e.,} they perform poorly on short sequences. Therefore, we propose to augment short sequences while still preserving the original sequential correlations. We introduce a new framework for \textbf{A}ugmenting \textbf{S}equential \textbf{Re}commendation with \textbf{P}seudo-prior items~(ASReP). We first pre-train a transformer on sequences in the reverse direction to predict prior items. Then, we use this transformer to generate fabricated historical items at the beginning of short sequences. Finally, we fine-tune the transformer on these augmented sequences in chronological order to predict the next item. Experiments on two real-world datasets verify the effectiveness of ASReP. The code is available at \url{https://github.com/DyGRec/ASReP}.
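The augmentation step described above (predict prior items, then prepend them to short sequences) can be sketched with a toy stand-in for the reverse-direction model. In this sketch a simple "which item most often comes before this one" count table replaces the pre-trained transformer, which is a deliberate simplification and not the ASReP model; the names `fit_reverse_bigram`, `augment_short_sequence`, `min_length`, and `max_new` are assumptions introduced for illustration.

```python
from collections import Counter, defaultdict


def fit_reverse_bigram(sequences):
    """Count, for each item, which items directly precede it.

    Toy stand-in for a model trained on reversed sequences to predict
    prior items; the abstract uses a transformer for this step.
    """
    prior_counts = defaultdict(Counter)
    for seq in sequences:
        for prev_item, item in zip(seq, seq[1:]):
            prior_counts[item][prev_item] += 1
    return prior_counts


def augment_short_sequence(seq, prior_counts, min_length, max_new=5):
    """Prepend pseudo-prior items until the sequence reaches min_length.

    The original items keep their order at the end of the result, so the
    observed sequential correlations are preserved.
    """
    augmented = list(seq)
    added = 0
    while augmented and len(augmented) < min_length and added < max_new:
        candidates = prior_counts.get(augmented[0])
        if not candidates:
            break  # no evidence about what could come before the first item
        best_prior = candidates.most_common(1)[0][0]
        augmented.insert(0, best_prior)
        added += 1
    return augmented


if __name__ == "__main__":
    history = [["a", "b", "c", "d"], ["a", "b", "c"], ["x", "c", "d"]]
    counts = fit_reverse_bigram(history)
    print(augment_short_sequence(["c", "d"], counts, min_length=4))
```

In the full pipeline the augmented sequences would then be used, in chronological order, to fine-tune the next-item predictor.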
Disinformation and fake news have had detrimental effects on individuals and society in recent years, attracting broad attention to fake news detection. The majority of existing fake news detection algorithms focus on mining news content and/or the surrounding exogenous context to discover deceptive signals, while the endogenous preference of a user when he/she decides whether or not to spread a piece of fake news is ignored. Confirmation bias theory indicates that a user is more likely to spread a piece of fake news when it confirms his/her existing beliefs/preferences. Users' historical, social engagements such as posts provide rich information about users' preferences toward news and have great potential to advance fake news detection. However, work on exploring user preference for fake news detection is somewhat limited. Therefore, in this paper, we study the novel problem of exploiting user preference for fake news detection. We propose a new framework, UPFD, which simultaneously captures various signals from user preferences by joint content and graph modeling. Experimental results on real-world datasets demonstrate the effectiveness of the proposed framework. We release our code and data as a benchmark for GNN-based fake news detection: https://github.com/safe-graph/GNN-FakeNews.
Graph neural networks (GNNs) have been widely used in deep learning on graphs. They can learn effective node representations that achieve superior performance in graph analysis tasks such as node classification and node clustering. However, most methods ignore the heterogeneity in real-world graphs. Methods designed for heterogeneous graphs, on the other hand, fail to learn complex semantic representations because they use only meta-paths instead of meta-graphs. Furthermore, they cannot fully capture the content-based correlations between nodes, as they either do not use the self-attention mechanism or use it only to consider the immediate neighbors of each node, ignoring the higher-order neighbors. We propose a novel Higher-order Attribute-Enhancing (HAE) framework that enhances node embeddings in a layer-by-layer manner. Under the HAE framework, we propose a Higher-order Attribute-Enhancing Graph Neural Network (HAEGNN) for heterogeneous network representation learning. HAEGNN simultaneously incorporates meta-paths and meta-graphs for rich, heterogeneous semantics, and leverages the self-attention mechanism to explore content-based node interactions. The unique higher-order architecture of HAEGNN allows it to examine first-order as well as higher-order neighborhoods. Moreover, HAEGNN shows good explainability, as it learns the importance of different meta-paths and meta-graphs. HAEGNN is also memory-efficient, because it avoids per-meta-path matrix calculation. Experimental results not only show HAEGNN's superior performance against state-of-the-art methods in node classification, node clustering, and visualization, but also demonstrate its superiority in terms of memory efficiency and explainability.
Graph Neural Networks (GNNs) have been widely used for the representation learning of various structured graph data, typically through message passing among nodes by aggregating their neighborhood information via different operations. While promising, most existing GNNs oversimplify the complexity and diversity of the edges in the graph, and are thus inefficient at coping with ubiquitous heterogeneous graphs, which are typically in the form of multi-relational graph representations. In this paper, we propose RioGNN, a novel Reinforced, recursive and flexible neighborhood selection guided multi-relational Graph Neural Network architecture, to navigate the complexity of neural network structures whilst maintaining relation-dependent representations. We first construct a multi-relational graph, according to the practical task, to reflect the heterogeneity of nodes, edges, attributes, and labels. To avoid embedding over-assimilation among different types of nodes, we employ a label-aware neural similarity measure to ascertain the most similar neighbors based on node attributes. A reinforced relation-aware neighbor selection mechanism is developed to choose the most similar neighbors of a target node within a relation before aggregating all neighborhood information from different relations to obtain the eventual node embedding. In particular, to improve the efficiency of neighbor selection, we propose a new recursive and scalable reinforcement learning framework with estimable depth and width for different scales of multi-relational graphs. RioGNN can learn more discriminative node embeddings with enhanced explainability, owing to the recognition of the individual importance of each relation via the filtering threshold mechanism.
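The select-then-aggregate pattern described above (keep the most similar neighbors within each relation, then combine the per-relation results) can be sketched as follows. Several simplifications are assumed here and are not taken from the abstract: cosine similarity on raw attribute vectors stands in for the label-aware neural similarity measure, fixed per-relation `keep_ratios` stand in for the thresholds that RioGNN learns with reinforcement learning, and mean pooling plus concatenation stands in for the actual aggregators. All function and parameter names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_neighbors(target, neighbors, features, keep_ratio):
    """Keep the top ceil(keep_ratio * n) neighbors by similarity (at least one)."""
    if not neighbors:
        return []
    k = max(1, math.ceil(keep_ratio * len(neighbors)))
    ranked = sorted(
        neighbors,
        key=lambda n: cosine(features[target], features[n]),
        reverse=True,
    )
    return ranked[:k]


def aggregate(target, relation_neighbors, features, keep_ratios):
    """Concatenate the target's features with one mean vector per relation."""
    dim = len(features[target])
    embedding = list(features[target])
    for relation, neighbors in relation_neighbors.items():
        kept = select_neighbors(target, neighbors, features, keep_ratios[relation])
        if kept:
            mean = [sum(features[n][i] for n in kept) / len(kept) for i in range(dim)]
        else:
            mean = [0.0] * dim  # relation with no neighbors contributes zeros
        embedding.extend(mean)
    return embedding


if __name__ == "__main__":
    feats = {0: [1, 0], 1: [1, 0.1], 2: [0, 1], 3: [0.9, 0], 4: [-1, 0]}
    rels = {"r1": [1, 2], "r2": [3, 4]}
    print(aggregate(0, rels, feats, {"r1": 0.5, "r2": 0.5}))
```

In this simplified form, the per-relation keep ratio is what would expose how much each relation's neighborhood is trusted, which loosely mirrors the explainability role the abstract attributes to the filtering threshold.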