Understanding nonlinear dynamical systems (NLDSs) is challenging in a variety of engineering and scientific fields. Dynamic mode decomposition (DMD), which is a numerical algorithm for the spectral analysis of Koopman operators, has been attracting attention as a way of obtaining global modal descriptions of NLDSs without requiring explicit prior knowledge. However, since existing DMD algorithms are in principle formulated based on the concatenation of scalar observables, they are not directly applicable to data with dependent structures among observables, which take, for example, the form of a sequence of graphs. In this paper, we formulate Koopman spectral analysis for NLDSs with structures among observables and propose an estimation algorithm for this problem. This method can extract and visualize the underlying low-dimensional global dynamics of NLDSs with structures among observables from data, which can be useful in understanding the underlying dynamics of such NLDSs. To this end, we first formulate the problem of estimating spectra of the Koopman operator defined in vector-valued reproducing kernel Hilbert spaces, and then develop an estimation procedure for this problem by reformulating tensor-based DMD. As a special case of our method, we propose a method named Graph DMD, which is a numerical algorithm for Koopman spectral analysis of graph dynamical systems, using a sequence of adjacency matrices. We investigate the empirical performance of our method by using synthetic and real-world data.
Spectral decomposition of the Koopman operator is attracting attention as a tool for the analysis of nonlinear dynamical systems. Dynamic mode decomposition is a popular numerical algorithm for Koopman spectral analysis; however, we often need to prepare nonlinear observables manually according to the underlying dynamics, which is not always possible since we may not have any a priori knowledge about them. In this paper, we propose a fully data-driven method for Koopman spectral analysis based on the principle of learning Koopman invariant subspaces from observed data. To this end, we propose minimization of the residual sum of squares of linear least-squares regression to estimate a set of functions that transforms data into a form in which the linear regression fits well. We introduce an implementation with neural networks and evaluate performance empirically using nonlinear dynamical systems and applications.
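The residual criterion described above can be illustrated with a minimal sketch. It assumes a hypothetical scalar toy system x_{t+1} = x_t^2 and two hand-picked candidate observables; it is not the neural-network implementation from the abstract. The helper name `rss_of_linear_fit` is introduced here for illustration only. For g(x) = log x the dynamics become exactly linear, since log(x^2) = 2 log x, so the residual sum of squares of the linear fit vanishes, whereas the identity observable leaves a clearly nonzero residual.

```python
import math

def rss_of_linear_fit(g, xs):
    """Fit g(x_{t+1}) ~ a * g(x_t) by least squares and return (a, RSS)."""
    y0 = [g(x) for x in xs[:-1]]   # observable at time t
    y1 = [g(x) for x in xs[1:]]    # observable at time t + 1
    a = sum(p * q for p, q in zip(y0, y1)) / sum(p * p for p in y0)
    rss = sum((q - a * p) ** 2 for p, q in zip(y0, y1))
    return a, rss

# Trajectory of the toy system x_{t+1} = x_t ** 2 (an assumption of this sketch).
xs = [0.9]
for _ in range(6):
    xs.append(xs[-1] ** 2)

a_id, rss_id = rss_of_linear_fit(lambda x: x, xs)   # identity observable
a_log, rss_log = rss_of_linear_fit(math.log, xs)    # log observable

print(f"identity: a = {a_id:.4f}, RSS = {rss_id:.3e}")
print(f"log:      a = {a_log:.4f}, RSS = {rss_log:.3e}")
```

Note that a constant-zero observable would also give zero residual, so a residual of this kind cannot by itself rule out degenerate observables.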
The proximal problem for structured penalties obtained via convex relaxations of submodular functions is known to be equivalent to minimizing separable convex functions over the corresponding submodular polyhedra. In this paper, we reveal a comprehensive class of structured penalties for which this problem can be solved via an efficiently solvable class of parametric maxflow optimization. We then show that the parametric maxflow algorithm proposed by Gallo et al. and its variants, which run, in the worst case, at the cost of only a constant factor of a single computation of the corresponding maxflow optimization, can be adapted to solve the proximal problems for those penalties. Several existing structured penalties satisfy these conditions; thus, regularized learning with these penalties is solvable quickly using the parametric maxflow algorithm. We also investigate the empirical runtime performance of the proposed framework.
Structural equation models and Bayesian networks have been widely used to analyze causal relations between continuous variables. In such frameworks, linear acyclic models are typically used to model the data-generating process of variables. Recently, it was shown that use of non-Gaussianity identifies a causal ordering of variables in a linear acyclic model without using any prior knowledge on the network structure, which is not the case with conventional methods. However, existing estimation methods are based on iterative search algorithms and may not converge to a correct solution in a finite number of steps. In this paper, we propose a new direct method to estimate a causal ordering based on non-Gaussianity. In contrast to the previous methods, our algorithm requires no algorithmic parameters and is guaranteed to converge to the right solution within a small fixed number of steps if the data strictly follows the model.
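How non-Gaussianity can orient a causal pair may be illustrated with a toy two-variable check. The sketch below is an assumption-laden illustration, not the estimation method of the abstract: it generates hypothetical data from x -> y with uniform (non-Gaussian) noise, regresses each variable on the other, and asks which regressor leaves a residual that looks more independent of it. The dependence score used here, the absolute correlation between the cubed regressor and the residual, is a crude stand-in chosen for brevity, and it can fail for some coefficient values.

```python
import random

def mean(v):
    return sum(v) / len(v)

def residual(target, regressor):
    """Least-squares residual of target regressed on regressor; also returns the slope."""
    mt, mr = mean(target), mean(regressor)
    t = [a - mt for a in target]
    r = [b - mr for b in regressor]
    slope = sum(a * b for a, b in zip(t, r)) / sum(b * b for b in r)
    return [a - slope * b for a, b in zip(t, r)], slope

def abs_corr(u, v):
    """Absolute Pearson correlation of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    cu = [a - mu for a in u]
    cv = [b - mv for b in v]
    num = sum(a * b for a, b in zip(cu, cv))
    den = (sum(a * a for a in cu) * sum(b * b for b in cv)) ** 0.5
    return abs(num / den)

def dependence(regressor, target):
    """Crude (illustrative) dependence score between a regressor and its residual."""
    res, _ = residual(target, regressor)
    mr = mean(regressor)
    cubed = [(b - mr) ** 3 for b in regressor]
    return abs_corr(cubed, res)

# Hypothetical data: x is exogenous, y = 2 x + noise, both noises uniform.
random.seed(0)
n = 20000
x = [random.uniform(-1, 1) for _ in range(n)]
y = [2.0 * xi + random.uniform(-1, 1) for xi in x]

score_x_first = dependence(x, y)   # residual of y given x, tested against x
score_y_first = dependence(y, x)   # residual of x given y, tested against y
first = "x" if score_x_first < score_y_first else "y"
_, strength = residual(y, x)       # estimated connection strength for x -> y

print("variable placed first in the ordering:", first)
print("estimated connection strength:", round(strength, 2))
```

With Gaussian noise both scores would be close to zero and the direction could not be told apart this way, which is the role non-Gaussianity plays in the argument.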
Discovering causal relations among observed variables in a given data set is a major objective in studies of statistics and artificial intelligence. Recently, some techniques to discover a unique causal model have been explored based on non-Gaussianity of the observed data distribution. However, most of these are limited to continuous data. In this paper, we present a novel causal model for binary data and propose an efficient new approach to deriving the unique causal model governing a given binary data set under skew distributions of external binary noises. Experimental evaluation shows excellent performance for both artificial and real world data sets.
A number of discrete and continuous optimization problems in machine learning are related to convex minimization problems under submodular constraints. In this paper, we deal with a submodular function with a directed graph structure, and we show that a wide range of convex optimization problems under submodular constraints can be solved much more efficiently than general submodular optimization methods by a reduction to a maximum flow problem. Furthermore, we give some applications, including sparse optimization methods, in which the proposed methods are effective. Additionally, we evaluate the performance of the proposed method through computational experiments.
As an increasing number of genome-wide association studies reveal the limitations of attempting to explain phenotypic heritability by single genetic loci, there is growing interest in associating complex phenotypes with sets of genetic loci. While several methods for multi-locus mapping have been proposed, it is often unclear how to relate the detected loci to the growing knowledge about gene pathways and networks. The few methods that take biological pathways or networks into account are either restricted to investigating a limited number of predetermined sets of loci, or do not scale to genome-wide settings. We present SConES, a new efficient method to discover sets of genetic loci that are maximally associated with a phenotype, while being connected in an underlying network. Our approach is based on a minimum cut reformulation of the problem of selecting features under sparsity and connectivity constraints that can be solved exactly and rapidly. SConES outperforms state-of-the-art competitors in terms of runtime, scales to hundreds of thousands of genetic loci, and exhibits higher power in detecting causal SNPs in simulation studies than existing methods. On flowering time phenotypes and genotypes from Arabidopsis thaliana, SConES detects loci that enable accurate phenotype prediction and that are supported by the literature. Matlab code for SConES is available at http://webdav.tuebingen.mpg.de/u/karsten/Forschung/scones/
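A minimum cut reformulation of feature selection under sparsity and connectivity penalties can be sketched on a toy network. The example below is an illustration under stated assumptions, not the released SConES code: it assumes the objective "sum of (score - eta) over selected features, minus lam for every network edge with exactly one selected endpoint", and the function names, scores, and parameters `eta` and `lam` are hypothetical. Features scoring above `eta` are tied to a source node, the others to a sink node, and the source side of a minimum cut is the selected set.

```python
from collections import deque

def min_cut_source_side(cap, s, t):
    """Edmonds-Karp max flow; returns the nodes reachable from s in the final residual graph."""
    n = len(cap)
    res = [row[:] for row in cap]              # residual capacities
    while True:
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:       # breadth-first search for an augmenting path
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and res[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:                    # no path left: labelled nodes form the source side
            return {v for v in range(n) if parent[v] != -1}
        bottleneck, v = float("inf"), t
        while v != s:                          # smallest residual capacity along the path
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:                          # push flow along the path
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u

def select_features(scores, edges, eta, lam):
    """Select features by a minimum s-t cut (illustrative objective, see lead-in)."""
    n = len(scores)
    s, t = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i, c in enumerate(scores):
        if c > eta:
            cap[s][i] = c - eta                # cost of leaving out a high-scoring feature
        else:
            cap[i][t] = eta - c                # cost of including a low-scoring feature
    for i, j in edges:
        cap[i][j] += lam                       # cost of separating network neighbours
        cap[j][i] += lam
    return sorted(min_cut_source_side(cap, s, t) - {s})

# Hypothetical path network 0-1-2-3-4 with two strong features joined by a weak one.
scores = [3.0, 0.5, 3.0, 0.0, 0.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
selected = select_features(scores, edges, eta=1.0, lam=1.0)
print(selected)
```

In this toy case the weakly scored feature 1 is pulled into the selection because it connects features 0 and 2, which is the intended effect of the connectivity penalty.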
Discovering causal relations among observed variables in a given data set is a main topic in studies of statistics and artificial intelligence. Recently, some techniques to discover an identifiable causal structure have been explored based on non-Gaussianity of the observed data distribution. However, most of these are limited to continuous data. In this paper, we present a novel causal model for binary data and propose a new approach to derive an identifiable causal structure governing the data based on skew Bernoulli distributions of external noise. Experimental evaluation shows excellent performance for both artificial and real world data sets.
Structural equation models and Bayesian networks have been widely used to analyze causal relations between continuous variables. In such frameworks, linear acyclic models are typically used to model the data-generating process of variables. Recently, it was shown that use of non-Gaussianity identifies the full structure of a linear acyclic model, i.e., a causal ordering of variables and their connection strengths, without using any prior knowledge on the network structure, which is not the case with conventional methods. However, existing estimation methods are based on iterative search algorithms and may not converge to a correct solution in a finite number of steps. In this paper, we propose a new direct method to estimate a causal ordering and connection strengths based on non-Gaussianity. In contrast to the previous methods, our algorithm requires no algorithmic parameters and is guaranteed to converge to the right solution within a small fixed number of steps if the data strictly follows the model.
Finding the structure of a graphical model has received much attention in many fields. Recently, it was reported that the non-Gaussianity of data enables us to identify the structure of a directed acyclic graph without any prior knowledge of the structure. In this paper, we propose a novel non-Gaussianity-based algorithm for a more general type of model: chain graphs. The algorithm finds an ordering of the disjoint subsets of variables by iteratively evaluating the independence between the variable subset and the residuals when the remaining variables are regressed on those. However, its computational cost grows exponentially with the number of variables. Therefore, we further discuss an efficient approximate approach for applying the algorithm to large graphs. We illustrate the algorithm with artificial and real-world datasets.