Graph neural networks (GNNs) aim to learn graph representations that preserve both attributive and structural information. In this paper, we study how to select high-quality nodes for training GNNs, given that GNNs are sensitive to the choice of training data. Active learning (AL), which seeks the most informative instances to maximize model performance, is a promising approach to this problem. Previous attempts have combined AL with graph representation learning by designing several selection criteria to measure how informative a node is. However, these methods do not directly utilize both the rich semantic and structural information and are prone to selecting sparsely-connected nodes (i.e., nodes having few neighbors) and low-purity nodes (i.e., nodes having noisy inter-class edges), which are less effective for training GNN models. To address these problems, we present a Deep Active Graph Representation Learning framework (DAGRL), in which three novel selection criteria are proposed. Specifically, we propose to measure the uncertainty of nodes via random topological perturbation. In addition, we propose two novel representativeness sampling criteria, which utilize both structural and label information to find densely-connected nodes with many intra-class edges and thereby improve the performance of GNN models significantly. We then combine these three criteria with time-sensitive scheduling in accordance with the training progress of GNNs. Furthermore, considering the different sizes of classes, we employ a novel cluster-aware node selection policy, which ensures that the number of selected nodes in each class is proportional to the size of the class. Comprehensive experiments on three public datasets show that our method outperforms previous baselines by a significant margin, which demonstrates its effectiveness.
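The size-proportional selection policy mentioned in the DAGRL abstract can be illustrated with a minimal sketch. The function names, the use of cluster ids as a stand-in for unknown class labels, and the largest-remainder rounding are all assumptions made for illustration; the abstract does not specify how the budget is rounded or how per-node scores are computed.

```python
from collections import Counter


def proportional_budget(cluster_ids, budget):
    """Split a labeling budget across clusters in proportion to cluster size.

    Assumption: largest-remainder rounding, so quotas sum exactly to `budget`.
    """
    sizes = Counter(cluster_ids)
    total = len(cluster_ids)
    exact = {c: budget * n / total for c, n in sizes.items()}
    quota = {c: int(x) for c, x in exact.items()}
    leftover = budget - sum(quota.values())
    # Hand the remaining slots to the clusters with the largest fractional part
    # (ties broken in favor of the larger cluster).
    by_remainder = sorted(
        exact, key=lambda c: (exact[c] - quota[c], sizes[c]), reverse=True
    )
    for c in by_remainder[:leftover]:
        quota[c] += 1
    return quota


def select_nodes(scores, cluster_ids, budget):
    """Pick the top-scoring nodes inside each cluster, up to that cluster's quota.

    `scores` is a hypothetical per-node informativeness score.
    """
    quota = proportional_budget(cluster_ids, budget)
    chosen = []
    for c, k in quota.items():
        members = [i for i, cid in enumerate(cluster_ids) if cid == c]
        members.sort(key=lambda i: scores[i], reverse=True)
        chosen.extend(members[:k])
    return sorted(chosen)
```

With clusters of sizes 6, 3, and 1 and a budget of 5, this sketch assigns quotas of 3, 2, and 0, so a high-scoring node in the smallest cluster can be passed over in favor of keeping the selection proportional.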
Recently, contrastive learning (CL) has emerged as a successful method for unsupervised graph representation learning. Most graph CL methods first perform stochastic augmentation on the input graph to obtain two graph views and then maximize the agreement of representations in the two views. Despite the rapid development of graph CL methods, the design of graph augmentation schemes, a crucial component in CL, remains rarely explored. We argue that data augmentation schemes should preserve the intrinsic structural and attribute information of graphs, which will force the model to learn representations that are insensitive to perturbation on unimportant nodes and edges. However, most existing methods adopt uniform data augmentation schemes, such as uniformly dropping edges and uniformly shuffling features, leading to suboptimal performance. In this paper, we propose a novel graph contrastive representation learning method with adaptive augmentation that incorporates various priors for topological and semantic aspects of the graph. Specifically, on the topology level, we design augmentation schemes based on node centrality measures to highlight important connective structures. On the node attribute level, we corrupt node features by adding more noise to unimportant node features, to force the model to recognize underlying semantic information. We perform extensive node classification experiments on a variety of real-world datasets. Experimental results demonstrate that our proposed method consistently outperforms existing state-of-the-art methods and even surpasses some supervised counterparts, which validates the effectiveness of the proposed contrastive framework with adaptive augmentation.
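The topology-level idea in the adaptive augmentation abstract, dropping unimportant edges more often than important ones, can be sketched as follows. This is only an illustration under stated assumptions: degree is used as the centrality measure, an edge's importance is the mean log-degree of its endpoints, and the normalization with `p_mean` and `p_max` is a hypothetical choice. None of these specifics come from the abstract.

```python
import math
import random


def edge_drop_probs(edges, p_mean=0.3, p_max=0.7):
    """Assign each edge a drop probability that decreases with its importance.

    Assumptions: `edges` is a non-empty list of undirected (u, v) pairs, and
    importance is the mean log-degree of the two endpoints.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    weights = [(math.log(degree[u]) + math.log(degree[v])) / 2 for u, v in edges]
    w_max = max(weights)
    w_avg = sum(weights) / len(weights)
    if w_max - w_avg < 1e-12:
        # All edges are equally important: fall back to uniform dropping.
        return [min(p_mean, p_max)] * len(edges)
    # The most important edge gets probability 0; probabilities are capped at p_max.
    return [min((w_max - w) / (w_max - w_avg) * p_mean, p_max) for w in weights]


def drop_edges(edges, probs, rng):
    """Sample one augmented view: keep each edge with probability 1 - p."""
    return [e for e, p in zip(edges, probs) if rng.random() >= p]
```

Calling `drop_edges` twice with the same probabilities yields two different graph views in which the highest-centrality connections tend to survive.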
The task of session-based recommendation is to predict user actions based on anonymous sessions. Recent research mainly models the target session as a sequence or a graph to capture item transitions within it, ignoring complex transitions between items in different sessions that have been generated by other users. These item transitions carry potential collaborative information and reflect similar behavior patterns, which we assume may help with the recommendation for the target session. In this paper, we propose a novel method, the Dual-channel Graph Transition Network (DGTN), to model item transitions within not only the target session but also its neighbor sessions. Specifically, we integrate the target session and its neighbor (similar) sessions into a single graph. The transition signals are then explicitly injected into the embedding by channel-aware propagation. Experiments on real-world datasets demonstrate that DGTN outperforms other state-of-the-art methods. Further analysis supports the rationale for dual-channel item transition modeling, suggesting a potential future direction for session-based recommendation.
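The graph construction step in the DGTN abstract, merging the target session with its neighbor sessions while keeping the two sources distinguishable, might look roughly like the sketch below. The Jaccard similarity used to pick neighbors, the parameter `k`, and the channel names `"intra"` and `"inter"` are assumptions for illustration and are not taken from the abstract.

```python
def transitions(session):
    """Return the set of consecutive item-to-item transitions in a session."""
    return set(zip(session, session[1:]))


def jaccard(a, b):
    """Item-set overlap between two sessions (assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_dual_channel_graph(target, others, k=2):
    """Merge the target session and its k most similar sessions into one graph.

    Edges are stored per channel so a later propagation step could treat
    transitions from the target session and from neighbor sessions differently.
    """
    similar = [s for s in others if jaccard(target, s) > 0]
    neighbors = sorted(similar, key=lambda s: jaccard(target, s), reverse=True)[:k]
    graph = {"intra": transitions(target), "inter": set()}
    for session in neighbors:
        graph["inter"] |= transitions(session)
    return graph
```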
Unsupervised graph representation learning aims to learn low-dimensional node embeddings without supervision while preserving graph topological structures and node attributive features. Previous graph neural networks (GNNs) require a large number of labeled nodes, which may not be accessible in real-world graph data. In this paper, we present a novel cluster-aware graph neural network (CAGNN) model for unsupervised graph representation learning using self-supervised techniques. In CAGNN, we perform clustering on the node embeddings and update the model parameters by predicting the cluster assignments. Moreover, we observe that graphs often contain inter-class edges, which mislead the GNN model into aggregating noisy information from neighboring nodes. We further refine the graph topology by strengthening intra-class edges and reducing node connections between different classes based on cluster labels, which better preserves cluster structures in the embedding space. We conduct comprehensive experiments on two benchmark tasks using real-world datasets. The results demonstrate the superior performance of the proposed model over existing baseline methods. Notably, our model achieves an improvement of over 7% in node clustering accuracy over state-of-the-art methods.
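The two ingredients named in the CAGNN abstract, cluster assignments used as pseudo-labels and topology refinement based on those labels, can be sketched in a few lines. This is a simplified illustration: the nearest-centroid assignment stands in for whatever clustering procedure the model actually uses, and the refinement only removes inter-cluster edges (it does not add or reweight intra-cluster ones). Both function names are hypothetical.

```python
def assign_clusters(embeddings, centroids):
    """Give each node the index of its nearest centroid (a pseudo-label)."""

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(centroids)), key=lambda c: sq_dist(e, centroids[c]))
        for e in embeddings
    ]


def refine_topology(edges, labels):
    """Split edges into those kept (same cluster) and those removed."""
    kept = [(u, v) for u, v in edges if labels[u] == labels[v]]
    removed = [(u, v) for u, v in edges if labels[u] != labels[v]]
    return kept, removed
```

In a training loop, the pseudo-labels would serve as prediction targets and the kept edges would form the graph used for the next round of message passing.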
3D photography is a new medium that allows viewers to more fully experience a captured moment. In this work, we refer to a 3D photo as one that displays parallax induced by moving the viewpoint (as opposed to a stereo pair with a fixed viewpoint). 3D photos are static in time, like traditional photos, but are displayed with interactive parallax on mobile or desktop screens, as well as on Virtual Reality devices, where viewing also includes stereo. We present an end-to-end system for creating and viewing 3D photos, and the algorithmic and design choices therein. Our 3D photos are captured in a single shot and processed directly on a mobile device. The method starts by estimating depth from the 2D input image using a new monocular depth estimation network that is optimized for mobile devices. It performs competitively with the state-of-the-art, but has lower latency and peak memory consumption and uses an order of magnitude fewer parameters. The resulting depth is lifted to a layered depth image (LDI), and new geometry is synthesized in parallax regions. We synthesize color texture and structures in the parallax regions as well, using an inpainting network, also optimized for mobile devices, on the LDI directly. Finally, we convert the result into a mesh-based representation that can be efficiently transmitted and rendered even on low-end devices and over poor network connections. Altogether, the processing takes just a few seconds on a mobile device, and the result can be instantly viewed and shared. We perform extensive quantitative evaluation to validate our system and compare its new components against the current state-of-the-art.
Graph representation learning has become fundamental in analyzing graph-structured data. Inspired by the recent success of contrastive methods, in this paper, we propose a novel framework for unsupervised graph representation learning by leveraging a contrastive objective at the node level. Specifically, we generate two graph views by corruption and learn node representations by maximizing the agreement of node representations in these two views. To provide diverse node contexts for the contrastive objective, we propose a hybrid scheme for generating graph views on both structure and attribute levels. In addition, we provide theoretical justification for our motivation from two perspectives: mutual information and the classical triplet loss. We perform empirical experiments on both transductive and inductive learning tasks using a variety of real-world datasets. Experimental results demonstrate that, despite its simplicity, our proposed method consistently outperforms existing state-of-the-art methods by large margins. Notably, our method achieves about 10% absolute improvement on protein function prediction. Our unsupervised method even surpasses its supervised counterparts on transductive tasks, demonstrating its great potential in real-world applications.
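A node-level contrastive objective of the kind described in this abstract can be sketched with an InfoNCE-style loss. This is a simplified illustration rather than the paper's exact objective: the positive pair is the same node in the two views, only the other nodes of the second view act as negatives, cosine similarity is used as the critic, and the temperature `tau` is a hypothetical parameter. Embeddings are assumed to be non-zero vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(view1, view2, tau=0.5):
    """Average InfoNCE-style loss over nodes.

    For node i, the positive is view2[i]; every other view2[j] is a negative.
    """
    n = len(view1)
    total = 0.0
    for i in range(n):
        logits = [cosine(view1[i], view2[j]) / tau for j in range(n)]
        log_denominator = math.log(sum(math.exp(l) for l in logits))
        # Negative log-softmax probability of the positive pair.
        total += log_denominator - logits[i]
    return total / n
```

The loss is small when each node's two views agree with each other more than with other nodes, which is the agreement the abstract refers to.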
Text classification is fundamental in natural language processing (NLP), and graph neural networks (GNNs) have recently been applied to this task. However, existing graph-based methods can neither capture the contextual word relationships within each document nor support inductive learning of new words. In this work, to overcome these problems, we propose TextING for inductive text classification via GNNs. We first build an individual graph for each document and then use a GNN to learn fine-grained word representations based on their local structures, which can also effectively produce embeddings for unseen words in a new document. Finally, the word nodes are aggregated into the document embedding. Extensive experiments on four benchmark datasets show that our method outperforms state-of-the-art text classification methods.
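Building an individual graph per document, as the TextING abstract describes, can be illustrated with a small sketch. The sliding-window co-occurrence rule and the `window` parameter are assumptions for illustration; the abstract itself does not state how edges are defined.

```python
def build_document_graph(tokens, window=3):
    """Build an undirected word graph for a single document.

    Nodes are the unique words; two words are connected if they appear
    together inside a sliding window of `window` consecutive tokens.
    """
    nodes = sorted(set(tokens))
    edges = set()
    for start in range(len(tokens)):
        span = tokens[start:start + window]
        for i, u in enumerate(span):
            for v in span[i + 1:]:
                if u != v:
                    # Store each undirected edge once, in sorted order.
                    edges.add(tuple(sorted((u, v))))
    return nodes, edges
```

Because the graph is built from the document alone, a word never seen during training still becomes a node with neighbors, which is what makes the inductive setting possible in this kind of construction.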
Session-based recommendation, which aims to predict users' actions based on anonymous sessions, plays a vital role in many websites. Many studies have emerged that model a session as a sequence or a graph by investigating temporal transitions of items in a session. However, these methods compress a session into one fixed representation vector without considering the target items to be predicted. The fixed vector restricts the representation ability of the recommender model, given the diversity of target items and users' interests. In this paper, we propose a novel target attentive graph neural network (TAGNN) model for session-based recommendation. In TAGNN, target-aware attention adaptively activates different user interests with respect to varied target items. The learned interest representation vector varies with different target items, greatly improving the expressiveness of the model. Moreover, TAGNN harnesses the power of graph neural networks to capture rich item transitions in sessions. Comprehensive experiments conducted on real-world datasets demonstrate its superiority over state-of-the-art methods.
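The target-aware attention idea in the TAGNN abstract, where the session representation changes with the candidate item, can be sketched as below. This is a simplification under stated assumptions: attention scores are plain dot products between session item embeddings and the target embedding, with no learned projection, and the function name is hypothetical.

```python
import math


def target_aware_interest(session_items, target):
    """Return attention weights and a target-specific interest vector.

    `session_items` is a list of item embeddings from one session and
    `target` is the embedding of a candidate item to be scored.
    """
    scores = [sum(h * t for h, t in zip(item, target)) for item in session_items]
    peak = max(scores)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(s - peak) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    interest = [
        sum(w * item[d] for w, item in zip(weights, session_items))
        for d in range(len(target))
    ]
    return weights, interest
```

Scoring the same session against two different targets produces two different interest vectors, in contrast to the single fixed session vector the abstract criticizes.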