This study examines the time complexities of the unbalanced optimal transport problems from an algorithmic perspective for the first time. We reveal which problems in unbalanced optimal transport can/cannot be solved efficiently. Specifically, we prove that the Kantrovich Rubinstein distance and optimal partial transport in Euclidean metric cannot be computed in strongly subquadratic time under the strong exponential time hypothesis. Then, we propose an algorithm that solves a more general unbalanced optimal transport problem exactly in quasi-linear time on a tree metric. The proposed algorithm processes a tree with one million nodes in less than one second. Our analysis forms a foundation for the theoretical study of unbalanced optimal transport algorithms and opens the door to the applications of unbalanced optimal transport to million-scale datasets.
Graph neural networks (GNNs) are effective machine learning models for various graph learning problems. Despite their empirical successes, the theoretical limitations of GNNs have been revealed recently. Consequently, many GNN models have been proposed to overcome these limitations. In this survey, we provide a comprehensive overview of the expressive power of GNNs and provably powerful variants of GNNs.
Graph neural networks (GNNs) are powerful machine learning models for various graph learning tasks. Recently, the limitations of the expressive power of various GNN models have been revealed. For example, GNNs cannot distinguish some non-isomorphic graphs and they cannot learn efficient graph algorithms, and several GNN models have been proposed to overcome these limitations. In this paper, we demonstrate that GNNs become powerful just by adding a random feature to each node. We prove that the random features enable GNNs to learn almost optimal polynomial-time approximation algorithms for the minimum dominating set problem and maximum matching problem in terms of the approximation ratio. The main advantage of our method is that it can be combined with off-the-shelf GNN models with slight modifications. Through experiments, we show that the addition of random features enables GNNs to solve various problems that normal GNNs, including GCNs and GINs, cannot solve.
The problem of comparing distributions endowed with their own geometry appears in various settings, e.g., when comparing graphs, high-dimensional point clouds, shapes, and generative models. Although the Gromov Wasserstein (GW) distance is usually presented as the natural geometry to handle such comparisons, computing it involves solving a NP-hard problem. In this paper, we propose the Anchor Energy (AE) and Anchor Wasserstein (AW) distances, simpler alternatives to GW built upon the representation of each point in each distribution as the 1D distribution of its distances to all other points. We propose a sweep line algorithm to compute AE \emph{exactly} in $O(n^2 \log n)$ time, where $n$ is the size of each measure, compared to a naive implementation of AE requires $O(n^3)$ efforts. This is quasi-linear w.r.t. the description of the problem itself. AW can be pending a single $O(n^3)$ effort, in addition to the $O(n^2)$ cost of running the Sinkhorn algorithm. We also propose robust versions of AE and AW using rank-based criteria rather than cost values. We show in our experiments that the AE and AW distances perform well in 3D shape comparison and graph matching, at a fraction of the computational cost of popular GW approximations.
In this paper, from a theoretical perspective, we study how powerful graph neural networks (GNNs) can be for learning approximation algorithms for combinatorial problems. To this end, we first establish a new class of GNNs that can solve strictly a wider variety of problems than existing GNNs. Then, we bridge the gap between GNN theory and the theory of distributed local algorithms to theoretically demonstrate that the most powerful GNN can learn approximation algorithms for the minimum dominating set problem and the minimum vertex cover problem with some approximation ratios and that no GNN can perform better than with these ratios. This paper is the first to elucidate approximation ratios of GNNs for combinatorial problems. Furthermore, we prove that adding coloring or weak-coloring to each node feature improves these approximation ratios. This indicates that preprocessing and feature engineering theoretically strengthen model capabilities.
Finding hard instances, which need a long time to solve, of graph problems such as the graph coloring problem and the maximum clique problem, is important for (1) building a good benchmark for evaluating the performance of algorithms, and (2) analyzing the algorithms to accelerate them. The existing methods for generating hard instances rely on parameters or rules that are found by domain experts; however, they are specific to the problem. Hence, it is difficult to generate hard instances for general cases. To address this issue, in this paper, we formulate finding hard instances of graph problems as two equivalent optimization problems. Then, we propose a method to automatically find hard instances by solving the optimization problems. The advantage of the proposed algorithm over the existing rule based approach is that it does not require any task specific knowledge. To the best of our knowledge, this is the first non-trivial method in the literature to automatically find hard instances. Through experiments on various problems, we demonstrate that our proposed method can generate instances that are a few to several orders of magnitude harder than the random based approach in many settings. In particular, our method outperforms rule-based algorithms in the 3-coloring problem.
Recent advancements in graph neural networks (GNN) have led to state-of-the-art performance in various applications including chemo-informatics, question answering systems, and recommendation systems, to name a few. However, making these methods scalable to huge graphs such as web-mining remains a challenge. In particular, the existing methods for accelerating GNN are either not theoretically guaranteed in terms of approximation error or require at least linear time computation cost. In this paper, we propose a constant time approximation algorithm for the inference and training of GNN that theoretically guarantees arbitrary precision with arbitrary probability. The key advantage of the proposed algorithm is that the complexity is completely independent of the number of nodes, edges, and neighbors of the input. To the best of our knowledge, this is the first constant time approximation algorithm for GNN with theoretical guarantee. Through experiments using synthetic and real-world datasets, we evaluate our proposed approximation algorithm and show that the algorithm can successfully approximate GNN in constant time.