Guang Cheng

Purdue

Optimal tuning for divide-and-conquer kernel ridge regression with massive data

Dec 18, 2016
Ganggang Xu, Zuofeng Shang, Guang Cheng

We propose the first data-driven tuning procedure for divide-and-conquer kernel ridge regression (Zhang et al., 2015). The proposed criterion is computationally scalable for massive data sets and is shown to be asymptotically optimal under mild conditions. The effectiveness of our method is illustrated by extensive simulations and an application to the Million Song Dataset.
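
To make the split-fit-average idea concrete, here is a minimal Python sketch of divide-and-conquer kernel ridge regression on simulated data, assuming an RBF kernel and equal-sized random splits. The hold-out selection of the penalty is only a simple stand-in for the data-driven criterion developed in the paper; the grid, bandwidth, and data are made up for illustration.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics import mean_squared_error


def dc_krr_predict(X, y, X_test, lam, n_splits=10, gamma=1.0, seed=0):
    """Divide-and-conquer KRR: fit one KRR per data block, average the predictions."""
    rng = np.random.default_rng(seed)
    blocks = np.array_split(rng.permutation(len(X)), n_splits)
    preds = [
        KernelRidge(alpha=lam, kernel="rbf", gamma=gamma)
        .fit(X[b], y[b])
        .predict(X_test)
        for b in blocks
    ]
    return np.mean(preds, axis=0)


# Toy data: y = sin(2*pi*x) + noise (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5000, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.3 * rng.standard_normal(5000)
X_val = rng.uniform(0, 1, size=(500, 1))
y_val = np.sin(2 * np.pi * X_val[:, 0]) + 0.3 * rng.standard_normal(500)

# Hold-out tuning over a small lambda grid -- a placeholder for the paper's
# data-driven, asymptotically optimal criterion.
grid = [1e-4, 1e-3, 1e-2, 1e-1]
errors = {lam: mean_squared_error(y_val, dc_krr_predict(X, y, X_val, lam))
          for lam in grid}
print("selected lambda:", min(errors, key=errors.get))
```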

Sparse Tensor Graphical Model: Non-convex Optimization and Statistical Inference

Sep 15, 2016
Will Wei Sun, Zhaoran Wang, Xiang Lyu, Han Liu, Guang Cheng

We consider the estimation and inference of sparse graphical models that characterize the dependency structure of high-dimensional tensor-valued data. To facilitate the estimation of the precision matrix corresponding to each way of the tensor, we assume the data follow a tensor normal distribution whose covariance has a Kronecker product structure. A critical challenge in the estimation and inference of this model is that its penalized maximum likelihood estimation involves minimizing a non-convex objective function. To address this, the paper makes two contributions: (i) In spite of the non-convexity of the estimation problem, we prove that an alternating minimization algorithm, which iteratively estimates each sparse precision matrix while fixing the others, attains an estimator with the optimal statistical rate of convergence. Notably, such an estimator achieves estimation consistency with only one tensor sample, which was not observed in previous work. (ii) We propose a de-biased statistical inference procedure for testing hypotheses on the true support of the sparse precision matrices, and employ it to test a growing number of hypotheses with false discovery rate (FDR) control. The asymptotic normality of our test statistic and the consistency of the FDR control procedure are established. Our theoretical results are further backed up by thorough numerical studies. We implement the methods in the publicly available R package Tlasso.
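
As a rough illustration of the alternating-minimization scheme, the Python sketch below treats the two-mode (matrix-variate) special case and plugs the graphical lasso in for each sparse precision update while the other mode is held fixed. It is a sketch under these assumptions, not the Tlasso implementation; the penalty level, number of sweeps, and data are arbitrary.

```python
import numpy as np
from sklearn.covariance import graphical_lasso

rng = np.random.default_rng(0)
n, p1, p2 = 50, 10, 8
X = rng.standard_normal((n, p1, p2))   # n matrix-valued observations (synthetic)

Omega1 = np.eye(p1)                    # mode-1 precision matrix (initial guess)
Omega2 = np.eye(p2)                    # mode-2 precision matrix (initial guess)

for _ in range(5):                     # a few alternating sweeps
    # Update mode 1: pool over samples, holding the mode-2 precision fixed.
    S1 = sum(Xi @ Omega2 @ Xi.T for Xi in X) / (n * p2)
    _, Omega1 = graphical_lasso(S1, alpha=0.05)
    # Update mode 2 symmetrically, holding the mode-1 precision fixed.
    S2 = sum(Xi.T @ Omega1 @ Xi for Xi in X) / (n * p1)
    _, Omega2 = graphical_lasso(S2, alpha=0.05)

print("fraction of nonzero entries, mode-1 precision:",
      np.mean(np.abs(Omega1) > 1e-8))
```

Each sweep solves one penalized problem per mode at that mode's dimension, which is the sense in which the alternating scheme sidesteps the non-convex joint problem.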

* 51 pages 

Provable Sparse Tensor Decomposition

May 02, 2016
Will Wei Sun, Junwei Lu, Han Liu, Guang Cheng

We propose a novel sparse tensor decomposition method, the Tensor Truncated Power (TTP) method, which incorporates variable selection into the estimation of the decomposition components. Sparsity is achieved via an efficient truncation step embedded in the tensor power iteration. Our method applies to a broad family of high-dimensional latent variable models, including high-dimensional Gaussian mixtures and mixtures of sparse regressions. A thorough theoretical investigation is conducted. In particular, we show that the final decomposition estimator is guaranteed to achieve a local statistical rate, which we strengthen to the global statistical rate by introducing a proper initialization procedure. In high-dimensional regimes, the obtained statistical rate significantly improves on those of existing non-sparse decomposition methods. The empirical advantages of TTP are confirmed by extensive simulations and two real applications: click-through rate prediction and high-dimensional gene clustering.
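
The sketch below illustrates the truncate-and-normalize power iteration that the abstract describes, for a single sparse rank-1 component of a synthetic third-order tensor. The warm start is only a stand-in for the paper's initialization procedure, and the truncation level s is chosen arbitrarily.

```python
import numpy as np


def truncate(v, s):
    """Keep the s largest-magnitude entries of v, zero out the rest, renormalize."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-s:]
    out[keep] = v[keep]
    return out / np.linalg.norm(out)


rng = np.random.default_rng(1)
p, s = 30, 5
# Planted sparse rank-1 signal plus noise (synthetic).
u0 = truncate(rng.standard_normal(p), s)
T = 5.0 * np.einsum("i,j,k->ijk", u0, u0, u0) + 0.1 * rng.standard_normal((p, p, p))

# Warm start near the truth -- a stand-in for the paper's initialization step.
u = truncate(u0 + 0.3 * rng.standard_normal(p), s)
v, w = u.copy(), u.copy()

for _ in range(20):                    # truncated power iterations
    u = truncate(np.einsum("ijk,j,k->i", T, v, w), s)
    v = truncate(np.einsum("ijk,i,k->j", T, u, w), s)
    w = truncate(np.einsum("ijk,i,j->k", T, u, v), s)

print("alignment with the planted factor:", abs(u @ u0))
```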

* To Appear in JRSS-B 

Stabilized Nearest Neighbor Classifier and Its Statistical Properties

Aug 30, 2015
Wei Sun, Xingye Qiao, Guang Cheng

The stability of a statistical analysis is an important indicator of reproducibility, one of the main principles of the scientific method. It entails that similar statistical conclusions can be reached from independent samples drawn from the same underlying population. In this paper, we introduce a general measure of classification instability (CIS) to quantify the sampling variability of the prediction made by a classification method. Interestingly, the asymptotic CIS of any weighted nearest neighbor classifier turns out to be proportional to the Euclidean norm of its weight vector. Based on this concise form, we propose a stabilized nearest neighbor (SNN) classifier, which distinguishes itself from other nearest neighbor classifiers by taking stability into consideration. In theory, we prove that SNN attains the minimax optimal convergence rate in risk, and a sharp convergence rate in CIS. The latter rate result is established for general plug-in classifiers under a low-noise condition. Extensive simulated and real-data examples demonstrate that SNN achieves a considerable improvement in CIS over existing nearest neighbor classifiers, with comparable classification accuracy. We implement the algorithm in the publicly available R package snn.
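
The short simulation below illustrates the CIS idea empirically: train the same nearest neighbor classifier on two independent samples from one population and record how often their predictions disagree on fresh points. It does not reproduce the SNN weighting scheme; the data-generating model, sample sizes, and values of k are made up.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def simulate(n, rng):
    """Synthetic two-class data with some label noise."""
    X = rng.standard_normal((n, 2))
    y = (X[:, 0] + 0.5 * rng.standard_normal(n) > 0).astype(int)
    return X, y


rng = np.random.default_rng(0)
X_test, _ = simulate(2000, rng)

for k in (1, 5, 25):
    disagreement = []
    for _ in range(20):                 # average over replications
        X1, y1 = simulate(200, rng)     # two independent training samples
        X2, y2 = simulate(200, rng)
        clf1 = KNeighborsClassifier(n_neighbors=k).fit(X1, y1)
        clf2 = KNeighborsClassifier(n_neighbors=k).fit(X2, y2)
        disagreement.append(np.mean(clf1.predict(X_test) != clf2.predict(X_test)))
    print(f"k={k:2d}  estimated CIS ~ {np.mean(disagreement):.3f}")
```

For uniformly weighted k-nearest neighbors the weight vector is (1/k, ..., 1/k), whose Euclidean norm is 1/sqrt(k), so the drop in empirical instability as k grows is consistent with the proportionality result stated above.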

* 48 Pages, 11 Figures. To Appear in JASA--T&M 

Local and global asymptotic inference in smoothing spline models

Nov 26, 2013
Zuofeng Shang, Guang Cheng

This article studies local and global inference for smoothing spline estimation in a unified asymptotic framework. We first introduce a new technical tool called functional Bahadur representation, which significantly generalizes the traditional Bahadur representation in parametric models, that is, Bahadur [Ann. Inst. Statist. Math. 37 (1966) 577-580]. Equipped with this tool, we develop four interconnected procedures for inference: (i) pointwise confidence interval; (ii) local likelihood ratio testing; (iii) simultaneous confidence band; (iv) global likelihood ratio testing. In particular, our confidence intervals are proved to be asymptotically valid at any point in the support, and they are shorter on average than the Bayesian confidence intervals proposed by Wahba [J. R. Stat. Soc. Ser. B Stat. Methodol. 45 (1983) 133-150] and Nychka [J. Amer. Statist. Assoc. 83 (1988) 1134-1143]. We also discuss a version of the Wilks phenomenon arising from local/global likelihood ratio testing. It is also worth noting that our simultaneous confidence bands are the first ones applicable to general quasi-likelihood models. Furthermore, issues relating to optimality and efficiency are carefully addressed. As a by-product, we discover a surprising relationship between periodic and nonperiodic smoothing splines in terms of inference.
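
To show the kind of object studied here, the sketch below fits a smoothing spline and forms a pointwise interval at one design point via a residual bootstrap. This is only a generic illustration: the paper's intervals come from the asymptotic theory (functional Bahadur representation) rather than resampling, and the smoothing level below is set by hand.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(x.size)

s = x.size * 0.3**2                      # smoothing level chosen by hand
fit = UnivariateSpline(x, y, s=s)
residuals = y - fit(x)

# Residual bootstrap: refit on perturbed responses and collect the curves.
curves = np.array([
    UnivariateSpline(x, fit(x) + rng.choice(residuals, size=x.size), s=s)(x)
    for _ in range(200)
])
lower, upper = np.percentile(curves, [2.5, 97.5], axis=0)

i0 = np.argmin(np.abs(x - 0.25))         # pointwise interval at x = 0.25
print(f"95% pointwise interval at x=0.25: [{lower[i0]:.3f}, {upper[i0]:.3f}]")
```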

* Annals of Statistics 2013, Vol. 41, No. 5, 2608-2638  
* Published at http://dx.doi.org/10.1214/13-AOS1164 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org) 