Directed acyclic graphs (DAGs) have been widely employed to represent directional relationships among a set of nodes. Yet the data available in a single study are often too limited for accurate DAG reconstruction, whereas heterogeneous data may be collected from multiple relevant studies. It remains an open question how to pool such heterogeneous data for better DAG structure reconstruction in the target study. In this paper, we first introduce a novel set of structural similarity measures for DAGs and then present a transfer DAG learning framework that effectively leverages information from auxiliary DAGs with different levels of similarity. Our theoretical analysis shows substantial improvement in DAG reconstruction in the target study, even when no auxiliary DAG is similar to the target DAG overall, which is in sharp contrast to most existing transfer learning methods. The advantage of the proposed transfer DAG learning is also supported by extensive numerical experiments on both synthetic data and multi-site brain functional connectivity network data.
We propose a general approach to evaluating the performance of robust estimators based on adversarial losses under misspecified models. We first show that the adversarial risk is equivalent to the risk induced by a distributional adversarial attack under certain smoothness conditions. This ensures that the adversarial training procedure is well-defined. To evaluate the generalization performance of the adversarial estimator, we study the adversarial excess risk. Our analysis covers both the generalization error and the approximation error. We then establish non-asymptotic upper bounds for the adversarial excess risk associated with Lipschitz loss functions. In addition, we apply our general results to adversarial training for classification and regression problems. For the quadratic loss in nonparametric regression, we show that the adversarial excess risk bound can be improved over that for a general loss.
Tensor Gaussian graphical models (GGMs), which capture conditional independence structures within tensor data, have important applications in numerous areas. Yet the tensor data available in a single study are often limited due to high acquisition costs. Although relevant studies can provide additional data, it remains an open question how to pool such heterogeneous data. In this paper, we propose a transfer learning framework for tensor GGMs, which takes full advantage of informative auxiliary domains even when non-informative auxiliary domains are present, thanks to carefully designed data-adaptive weights. Our theoretical analysis shows substantial improvement in estimation error and variable selection consistency on the target domain under much relaxed conditions, by leveraging information from auxiliary domains. Extensive numerical experiments are conducted on both synthetic tensor graphs and brain functional connectivity network data, which demonstrate the satisfactory performance of the proposed method.
Temporal networks have become ubiquitous with the rise of online social platforms and e-commerce, but remain largely under-investigated in the literature. In this paper, we propose a statistical framework for temporal network analysis, leveraging the strengths of adaptive network merging, tensor decomposition, and point processes. A two-step embedding procedure and a regularized maximum likelihood estimate based on a Poisson point process are developed, where the initial estimate is based on equally spaced time intervals and the final estimate on adaptively merged time intervals. A projected gradient descent algorithm is proposed to facilitate estimation, and an upper bound on the tensor estimation error in each iteration is established. Our analysis shows that the tensor estimation error is significantly reduced by the proposed method. Extensive numerical experiments also validate this reduction, as well as the advantage of the method over existing competitors. The proposed method is also applied to analyze a militarized interstate dispute dataset, where not only does the prediction accuracy increase, but the adaptively merged intervals also lead to a clear interpretation.
Signed networks, in which each edge carries additional sign information, are frequently observed in real life, yet such information has been largely ignored in existing network models. This paper develops a unified embedding model for signed networks to disentangle the intertwined balance structure and anomaly effect, which can greatly facilitate downstream analysis, including community detection, anomaly detection, and network inference. The proposed model captures the balance structure and the anomaly effect through a low-rank plus sparse matrix decomposition, and the two components are jointly estimated via a regularized formulation. Its theoretical guarantees are established in terms of asymptotic consistency and finite-sample probability bounds for network embedding, community detection, and anomaly detection. The advantage of the proposed embedding model is also demonstrated through extensive numerical experiments on both synthetic networks and an international relations network.
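As a rough illustration of what a low-rank plus sparse decomposition of a signed adjacency matrix can look like, the sketch below alternates a truncated SVD with entrywise soft-thresholding. This is a generic heuristic, not the regularized estimator of the paper; the function names, the `rank` and `tau` parameters, and the toy matrix are all assumptions introduced here.

```python
import numpy as np


def soft_threshold(x, tau):
    """Entrywise soft-thresholding: shrink every entry toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def low_rank_plus_sparse(a, rank, tau, n_iter=50):
    """Split `a` into a rank-`rank` part and a sparse part (illustrative heuristic).

    Alternates between (1) the best rank-`rank` approximation of `a - sparse`
    and (2) soft-thresholding of the residual `a - low`.
    """
    a = np.asarray(a, dtype=float)
    low = np.zeros_like(a)
    sparse = np.zeros_like(a)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(a - sparse, full_matrices=False)
        low = (u[:, :rank] * s[:rank]) @ vt[:rank]
        sparse = soft_threshold(a - low, tau)
    return low, sparse


# Toy signed network: two balanced camps (rank-one sign pattern) with one flipped edge.
z = np.array([1.0] * 10 + [-1.0] * 10)
a = np.outer(z, z)
a[0, 1] = a[1, 0] = -1.0  # hypothetical anomalous edge
low, sparse = low_rank_plus_sparse(a, rank=1, tau=0.5)
```

In this toy setting the low-rank part plays the role of the balance structure and the sparse part collects edges that deviate from it; the mapping onto the paper's model is only loose.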
Directed acyclic graph (DAG) models are widely used to represent causal relationships among random variables in many application domains. This paper studies a special class of non-Gaussian DAG models, where the conditional variance of each node given its parents is a quadratic function of its conditional mean. This class of non-Gaussian DAG models is fairly flexible and admits many popular distributions as special cases, including Poisson, Binomial, Geometric, Exponential, and Gamma. To facilitate learning, we introduce a novel concept of topological layers and develop an efficient DAG learning algorithm. It first reconstructs the topological layers in a hierarchical fashion and then recovers the directed edges between nodes in different layers, which requires much less computational cost than most existing algorithms in the literature. Its advantage is also demonstrated in a number of simulated examples, as well as in applications to two real-life datasets: an NBA player statistics dataset and a cosmetics sales dataset collected by Alibaba.
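The quadratic variance condition can be written out explicitly. The notation below (the parent set $\mathrm{pa}(j)$, the conditional mean $\mu_j$, and the coefficients $\beta_{j0}, \beta_{j1}$) is assumed here for illustration and is not taken from the abstract; the listed coefficient values are standard mean-variance identities for the named distributions.

```latex
% Quadratic variance condition for node j given its parents pa(j) (notation assumed)
\mu_j = \mathbb{E}\bigl[X_j \mid X_{\mathrm{pa}(j)}\bigr], \qquad
\operatorname{Var}\bigl(X_j \mid X_{\mathrm{pa}(j)}\bigr) = \beta_{j0}\,\mu_j + \beta_{j1}\,\mu_j^{2}.

% Special cases, written as (\beta_{j0}, \beta_{j1}):
%   Poisson:                           (1,  0)         Var = \mu
%   Binomial with N trials:            (1, -1/N)       Var = \mu - \mu^2 / N
%   Geometric (number of failures):    (1,  1)         Var = \mu + \mu^2
%   Exponential:                       (0,  1)         Var = \mu^2
%   Gamma with shape \alpha:           (0,  1/\alpha)  Var = \mu^2 / \alpha
```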
Acyclic models, often depicted as directed acyclic graphs (DAGs), have been widely employed to represent directional causal relations among a set of nodes. In this article, we propose an efficient method to learn linear non-Gaussian DAGs in high-dimensional cases, where the noise can follow any continuous non-Gaussian distribution. This is in sharp contrast to most existing DAG learning methods, which assume Gaussian noise with additional variance assumptions to attain exact DAG recovery. The proposed method leverages a novel concept of topological layers to facilitate DAG learning. In particular, we show that the topological layers can be exactly reconstructed in a bottom-up fashion, and the parent-child relations among nodes in each layer can also be consistently established. More importantly, the proposed method does not require the faithfulness or parental faithfulness assumption, which has been widely assumed in the DAG learning literature. Its advantage is also supported by numerical comparisons against some popular competitors in various simulated examples, as well as a real application to the global spread of COVID-19.
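To make the notion of topological layers concrete, the sketch below computes layers for a DAG that is already known, by repeatedly peeling off nodes with no remaining children. It assumes a node's layer is the length of its longest directed path down to a childless node, a definition the abstract does not spell out, and it does not attempt to learn the layers from data as the proposed method does. The function name and the input format are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set


def topological_layers(children: Dict[Hashable, Iterable[Hashable]]) -> List[Set[Hashable]]:
    """Return layers bottom-up: layer 0 holds childless nodes, layer 1 holds nodes
    whose children all lie in layer 0, and so on.

    `children` maps each node to the nodes it points to. Nodes that only
    appear as children are added automatically.
    """
    nodes = set(children) | {kid for kids in children.values() for kid in kids}
    remaining = {node: set(children.get(node, ())) for node in nodes}
    layers: List[Set[Hashable]] = []
    while remaining:
        # Nodes whose children have all been peeled off form the next layer.
        layer = {node for node, kids in remaining.items() if not kids}
        if not layer:
            raise ValueError("graph contains a directed cycle")
        layers.append(layer)
        for node in layer:
            del remaining[node]
        for kids in remaining.values():
            kids -= layer
    return layers


# Example DAG: a -> b -> c, a -> c, and an isolated node d.
example = {"a": ["b", "c"], "b": ["c"], "d": []}
print(topological_layers(example))
```

For the example DAG, the childless nodes `c` and `d` form the bottom layer, followed by `b` and then `a`.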
This paper considers the partially functional linear model (PFLM), where the predictive features consist of a functional covariate and a high-dimensional scalar vector. Over an infinite-dimensional reproducing kernel Hilbert space, the proposed estimator for the PFLM is a least squares approach with two mixed regularizations: a function norm and an $\ell_1$-norm. Our main task in this paper is to establish the minimax rates for the PFLM under the high-dimensional setting, and the optimal minimax rates of estimation are established by using various techniques from empirical process theory for analyzing kernel classes. In addition, we propose an efficient numerical algorithm based on randomized sketches of the kernel matrix. Several numerical experiments are conducted to support our method and optimization strategy.
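The idea of a randomized sketch of the kernel matrix can be illustrated on plain kernel ridge regression. The sketch below is a simplification under stated assumptions: it drops the functional covariate and the $\ell_1$ penalty of the PFLM entirely, uses a Gaussian sketch matrix, and minimizes $\frac{1}{n}\|y - K S^\top \alpha\|^2 + \lambda\, \alpha^\top S K S^\top \alpha$ over $\alpha \in \mathbb{R}^m$. All function names and parameters are hypothetical.

```python
import numpy as np


def rbf_kernel(x, gamma=1.0):
    """Gaussian (RBF) kernel matrix for one-dimensional inputs."""
    diff = x[:, None] - x[None, :]
    return np.exp(-gamma * diff ** 2)


def gaussian_sketch(m, n, seed=0):
    """m x n sketch matrix with i.i.d. N(0, 1/m) entries."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((m, n)) / np.sqrt(m)


def sketched_krr_fit(kernel, y, sketch, lam):
    """Fitted values of sketched kernel ridge regression.

    Solves (S K^2 S^T + n * lam * S K S^T) alpha = S K y, an m x m system
    instead of the n x n system of the unsketched problem.
    """
    n = kernel.shape[0]
    ks = kernel @ sketch.T                      # K S^T, shape (n, m)
    lhs = ks.T @ ks + n * lam * (sketch @ ks)   # S K^2 S^T + n*lam * S K S^T
    rhs = ks.T @ y                              # S K y
    alpha = np.linalg.lstsq(lhs, rhs, rcond=None)[0]
    return ks @ alpha


# Usage: compress an 8 x 8 kernel problem to a 4 x 4 linear system.
x = np.arange(8.0)
y = np.sin(x)
fitted = sketched_krr_fit(rbf_kernel(x), y, gaussian_sketch(4, 8), lam=0.1)
```

With the identity matrix as the sketch, the fitted values coincide with those of ordinary kernel ridge regression, which gives a simple sanity check on the linear system.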
Network data has attracted tremendous attention in recent years, and most conventional network models focus on pairwise interactions between two vertices. However, real-life network data may display more complex structures, and multi-way interactions among vertices arise naturally. In this article, we propose a novel method for detecting community structure in general hypergraph networks, uniform or non-uniform. The proposed method introduces a null vertex to augment a non-uniform hypergraph into a uniform multi-hypergraph, and then embeds the multi-hypergraph in a low-dimensional vector space such that vertices within the same community are close to each other. The resulting optimization task can be efficiently tackled by an alternating updating scheme. The asymptotic consistency of the proposed method is established in terms of both community detection and hypergraph estimation, and is also supported by numerical experiments on synthetic and real-life hypergraph networks.
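One plausible reading of the null-vertex augmentation is that every hyperedge smaller than the largest one is padded with copies of a single extra vertex until all hyperedges have the same size. The sketch below implements that reading only; the padding rule, the function name, and the `"null"` label are assumptions, and the embedding step is not shown.

```python
from typing import Hashable, Iterable, List, Tuple

NULL_VERTEX = "null"  # hypothetical label for the added null vertex


def augment_to_uniform(
    hyperedges: Iterable[Iterable[Hashable]],
    null_vertex: Hashable = NULL_VERTEX,
) -> List[Tuple[Hashable, ...]]:
    """Pad each hyperedge with the null vertex up to the largest hyperedge size.

    The null vertex may be repeated within a hyperedge, which is why the
    result is a multi-hypergraph rather than an ordinary hypergraph.
    """
    edges = [tuple(edge) for edge in hyperedges]
    if not edges:
        return []
    m = max(len(edge) for edge in edges)
    return [edge + (null_vertex,) * (m - len(edge)) for edge in edges]


# A non-uniform hypergraph with hyperedges of sizes 2, 3, and 1.
print(augment_to_uniform([(1, 2), (1, 2, 3), (4,)]))
# [(1, 2, 'null'), (1, 2, 3), (4, 'null', 'null')]
```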
Sparse learning aims to learn the sparse structure of the true target function from the collected data, and plays a crucial role in high-dimensional data analysis. This article proposes a unified and universal method for learning the sparsity of M-estimators within a rich family of loss functions in a reproducing kernel Hilbert space (RKHS). The family of loss functions considered is very rich and includes most of those commonly used in the literature. More importantly, the proposed method is motivated by some nice properties of the induced RKHS, is computationally efficient for large-scale data, and can be further accelerated through parallel computing. The asymptotic estimation and selection consistencies of the proposed method are established for a general loss function under mild conditions. It works for general loss functions, admits general dependence structures, allows for efficient computation, and comes with theoretical guarantees. The superior performance of the proposed method is also supported by a variety of simulated examples and a real application to a human breast cancer study (GSE20194).