Node classification in attributed graphs is an important task in multiple practical settings, but it can often be difficult or expensive to obtain labels. Active learning can improve the achieved classification performance for a given budget on the number of queried labels. The best existing methods are based on graph neural networks, but they often perform poorly unless a sizeable validation set of labelled nodes is available in order to choose good hyperparameters. We propose a novel graph-based active learning algorithm for the task of node classification in attributed graphs; our algorithm uses graph cognizant logistic regression, equivalent to a linearized graph convolutional neural network (GCN), for the prediction phase and maximizes the expected error reduction in the query phase. To reduce the delay experienced by a labeller interacting with the system, we derive a preemptive querying system that calculates a new query during the labelling process, and to address the setting where learning starts with almost no labelled data, we also develop a hybrid algorithm that performs adaptive model averaging of label propagation and linearized GCN inference. We conduct experiments on five public benchmark datasets, demonstrating a significant improvement over state-of-the-art approaches and illustrate the practical value of the method by applying it to a private microwave link network dataset.
Graphs are ubiquitous in modelling relational structures. Recent endeavours in machine learning for graph-structured data have led to many architectures and learning algorithms. However, the graph used by these algorithms is often constructed based on inaccurate modelling assumptions and/or noisy data. As a result, it fails to represent the true relationships between nodes. A Bayesian framework which targets posterior inference of the graph by considering it as a random quantity can be beneficial. In this paper, we propose a novel non-parametric graph model for constructing the posterior distribution of graph adjacency matrices. The proposed model is flexible in the sense that it can effectively take into account the output of graph-based learning algorithms that target specific tasks. In addition, model inference scales well to large graphs. We demonstrate the advantages of this model in three different problem settings: node classification, link prediction and recommendation.
Graph convolutional neural networks (GCNN) have numerous applications in different graph based learning tasks. Although the techniques obtain impressive results, they often fall short in accounting for the uncertainty associated with the underlying graph structure. In the recently proposed Bayesian GCNN (BGCN) framework, this issue is tackled by viewing the observed graph as a sample from a parametric random graph model and targeting joint inference of the graph and the GCNN weights. In this paper, we introduce an alternative generative model for graphs based on copying nodes and incorporate it within the BGCN framework. Our approach has the benefit that it uses information provided by the node features and training labels in the graph topology inference. Experiments show that the proposed algorithm compares favorably to the state-of-the-art in benchmark node classification tasks.
Graph convolutional neural networks (GCNN) have been successfully applied to many different graph based learning tasks including node and graph classification, matrix completion, and learning of node embeddings. Despite their impressive performance, the techniques have a limited capability to incorporate the uncertainty in the underlined graph structure. In order to address this issue, a Bayesian GCNN (BGCN) framework was recently proposed. In this framework, the observed graph is considered to be a random realization from a parametric random graph model and the joint Bayesian inference of the graph and GCNN weights is performed. In this paper, we propose a non-parametric generative model for graphs and incorporate it within the BGCN framework. In addition to the observed graph, our approach effectively uses the node features and training labels in the posterior inference of graphs and attains superior or comparable performance in benchmark node classification tasks.
We address the task of identifying densely connected subsets of multivariate Gaussian random variables within a graphical model framework. We propose two novel estimators based on the Ordered Weighted $\ell_1$ (OWL) norm: 1) The Graphical OWL (GOWL) is a penalized likelihood method that applies the OWL norm to the lower triangle components of the precision matrix. 2) The column-by-column Graphical OWL (ccGOWL) estimates the precision matrix by performing OWL regularized linear regressions. Both methods can simultaneously identify highly correlated groups of variables and control the sparsity in the resulting precision matrix. We formulate GOWL such that it solves a composite optimization problem and establish that the estimator has a unique global solution. In addition, we prove sufficient grouping conditions for each column of the ccGOWL precision matrix estimate. We propose proximal descent algorithms to find the optimum for both estimators. For synthetic data where group structure is present, the ccGOWL estimator requires significantly reduced computation and achieves similar or greater accuracy than state-of-the-art estimators. Timing comparisons are presented and demonstrates the superior computational efficiency of the ccGOWL. We illustrate the grouping performance of the ccGOWL method on a cancer gene expression data set and an equities data set.
Recently, techniques for applying convolutional neural networks to graph-structured data have emerged. Graph convolutional neural networks (GCNNs) have been used to address node and graph classification and matrix completion. Although the performance has been impressive, the current implementations have limited capability to incorporate uncertainty in the graph structure. Almost all GCNNs process a graph as though it is a ground-truth depiction of the relationship between nodes, but often the graphs employed in applications are themselves derived from noisy data or modelling assumptions. Spurious edges may be included; other edges may be missing between nodes that have very strong relationships. In this paper we adopt a Bayesian approach, viewing the observed graph as a realization from a parametric family of random graphs. We then target inference of the joint posterior of the random graph parameters and the node (or graph) labels. We present the Bayesian GCNN framework and develop an iterative learning procedure for the case of assortative mixed-membership stochastic block models. We present the results of experiments that demonstrate that the Bayesian formulation can provide better performance when there are very few labels available during the training process.
Decentralized receding horizon control (D-RHC) provides a mechanism for coordination in multi-agent settings without a centralized command center. However, combining a set of different goals, costs, and constraints to form an efficient optimization objective for D-RHC can be difficult. To allay this problem, we use a meta-learning process -- cost adaptation -- which generates the optimization objective for D-RHC to solve based on a set of human-generated priors (cost and constraint functions) and an auxiliary heuristic. We use this adaptive D-RHC method for control of mesh-networked swarm agents. This formulation allows a wide range of tasks to be encoded and can account for network delays, heterogeneous capabilities, and increasingly large swarms through the adaptation mechanism. We leverage the Unity3D game engine to build a simulator capable of introducing artificial networking failures and delays in the swarm. Using the simulator we validate our method on an example coordinated exploration task. We demonstrate that cost adaptation allows for more efficient and safer task completion under varying environment conditions and increasingly large swarm sizes. We release our simulator and code to the community for future work.
Microwave-based breast cancer detection has been proposed as a complementary approach to compensate for some drawbacks of existing breast cancer detection techniques. Among the existing microwave breast cancer detection methods, machine learning-type algorithms have recently become more popular. These focus on detecting the existence of breast tumours rather than performing imaging to identify the exact tumour position. A key step of the machine learning approaches is feature extraction. One of the most widely used feature extraction method is principle component analysis (PCA). However, it can be sensitive to signal misalignment. This paper presents an empirical mode decomposition (EMD)-based feature extraction method, which is more robust to the misalignment. Experimental results involving clinical data sets combined with numerically simulated tumour responses show that combined features from EMD and PCA improve the detection performance with an ensemble selection-based classifier.
We consider the problem of multivariate regression in a setting where the relevant predictors could be shared among different responses. We propose an algorithm which decomposes the coefficient matrix into the product of a long matrix and a wide matrix, with an elastic net penalty on the former and an $\ell_1$ penalty on the latter. The first matrix linearly transforms the predictors to a set of latent factors, and the second one regresses the responses on these factors. Our algorithm simultaneously performs dimension reduction and coefficient estimation and automatically estimates the number of latent factors from the data. Our formulation results in a non-convex optimization problem, which despite its flexibility to impose effective low-dimensional structure, is difficult, or even impossible, to solve exactly in a reasonable time. We specify an optimization algorithm based on alternating minimization with three different sets of updates to solve this non-convex problem and provide theoretical results on its convergence and optimality. Finally, we demonstrate the effectiveness of our algorithm via experiments on simulated and real data.
In this paper, we consider a generalized multivariate regression problem where the responses are monotonic functions of linear transformations of predictors. We propose a semi-parametric algorithm based on the ordering of the responses which is invariant to the functional form of the transformation function. We prove that our algorithm, which maximizes the rank correlation of responses and linear transformations of predictors, is a consistent estimator of the true coefficient matrix. We also identify the rate of convergence and show that the squared estimation error decays with a rate of $o(1/\sqrt{n})$. We then propose a greedy algorithm to maximize the highly non-smooth objective function of our model and examine its performance through extensive simulations. Finally, we compare our algorithm with traditional multivariate regression algorithms over synthetic and real data.