
John Lafferty

University of Chicago

Nonparametric Reduced Rank Regression

Jan 09, 2013
Rina Foygel, Michael Horrell, Mathias Drton, John Lafferty

We propose an approach to multivariate nonparametric regression that generalizes reduced rank regression for linear models. An additive model is estimated for each dimension of a $q$-dimensional response, with a shared $p$-dimensional predictor variable. To control the complexity of the model, we employ a functional form of the Ky-Fan or nuclear norm, resulting in a set of function estimates that have low rank. Backfitting algorithms are derived and justified using a nonparametric form of the nuclear norm subdifferential. Oracle inequalities on excess risk are derived that exhibit the scaling behavior of the procedure in the high dimensional setting. The methods are illustrated on gene expression data.
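
In the linear special case, the penalty reduces to the ordinary nuclear norm, whose proximal operator soft-thresholds the singular values; this shrinkage step is the basic building block of such low-rank procedures. A minimal numpy sketch (the function name `nuclear_prox` is ours, not from the paper):

```python
import numpy as np

def nuclear_prox(M, lam):
    """Proximal operator of lam * (nuclear norm): soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))
P = nuclear_prox(M, 1.0)
```

Because every singular value is shrunk toward zero, the output's rank can only stay the same or drop, which is how the penalty induces low-rank function estimates.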

Sparse Nonparametric Graphical Models

Jan 07, 2013
John Lafferty, Han Liu, Larry Wasserman

We present some nonparametric methods for graphical modeling. In the discrete case, where the data are binary or drawn from a finite alphabet, Markov random fields are already essentially nonparametric, since the cliques can take only a finite number of values. Continuous data are different. The Gaussian graphical model is the standard parametric model for continuous data, but it makes distributional assumptions that are often unrealistic. We discuss two approaches to building more flexible graphical models. One allows arbitrary graphs and a nonparametric extension of the Gaussian; the other uses kernel density estimation and restricts the graphs to trees and forests. Examples of both methods are presented. We also discuss possible future research directions for nonparametric graphical modeling.
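
The first approach (the nonparametric extension of the Gaussian, known as the nonparanormal) transforms each coordinate toward normality via its empirical CDF before fitting a Gaussian graphical model. A sketch of that marginal transform, with the Winsorization level following the form suggested by Liu et al. (2009); the function name is ours:

```python
import numpy as np
from scipy.stats import norm, rankdata

def nonparanormal_transform(X):
    """Map each column to approximate normality via its empirical CDF."""
    n = X.shape[0]
    delta = 1.0 / (4 * n**0.25 * np.sqrt(np.pi * np.log(n)))  # truncation level
    U = rankdata(X, axis=0) / (n + 1)        # empirical CDF values in (0, 1)
    U = np.clip(U, delta, 1 - delta)         # Winsorize the tails
    return norm.ppf(U)                       # Gaussian quantile transform

rng = np.random.default_rng(1)
X = np.exp(rng.standard_normal((200, 3)))   # lognormal: marginally non-Gaussian
Z = nonparanormal_transform(X)
```

A standard Gaussian graphical model estimator can then be run on `Z`.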

* Statistical Science 2012, Vol. 27, No. 4, 519-537  
* Published by the Institute of Mathematical Statistics (http://www.imstat.org) in Statistical Science (http://www.imstat.org/sts/), http://dx.doi.org/10.1214/12-STS391 

Expectation-Propagation for the Generative Aspect Model

Dec 12, 2012
Thomas P. Minka, John Lafferty

The generative aspect model is an extension of the multinomial model for text that allows word probabilities to vary stochastically across documents. Previous results with aspect models have been promising, but hindered by the computational difficulty of carrying out inference and learning. This paper demonstrates that the simple variational methods of Blei et al. (2001) can lead to inaccurate inferences and biased learning for the generative aspect model. We develop an alternative approach that leads to higher accuracy at comparable cost. An extension of Expectation-Propagation is used for inference and then embedded in an EM algorithm for learning. Experimental results are presented for both synthetic and real data sets.
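
The core operation in Expectation-Propagation is projecting an intractable "tilted" distribution onto the approximating family by matching moments. As a toy illustration of that projection step only (not the paper's actual update; the function is hypothetical), matching a Gaussian to a two-component Gaussian mixture:

```python
def match_gaussian_to_mixture(w, m1, v1, m2, v2):
    """Mean and variance of the Gaussian matching the first two moments of
    the mixture w * N(m1, v1) + (1 - w) * N(m2, v2)."""
    mean = w * m1 + (1 - w) * m2
    second_moment = w * (v1 + m1**2) + (1 - w) * (v2 + m2**2)
    return mean, second_moment - mean**2

m, v = match_gaussian_to_mixture(0.3, -1.0, 0.5, 2.0, 1.0)
# m = 1.1, v = 2.74
```

EP applies this kind of projection repeatedly, one site at a time, which is what lets it stay accurate where simpler variational factorizations go astray.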

* Appears in Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI2002) 

High Dimensional Semiparametric Gaussian Copula Graphical Models

Jul 27, 2012
Han Liu, Fang Han, Ming Yuan, John Lafferty, Larry Wasserman

In this paper, we propose a semiparametric approach, named the nonparanormal skeptic, for efficiently and robustly estimating high dimensional undirected graphical models. To achieve modeling flexibility, we consider Gaussian copula graphical models (or the nonparanormal) as proposed by Liu et al. (2009). To achieve estimation robustness, we exploit nonparametric rank-based correlation coefficient estimators, including Spearman's rho and Kendall's tau. In high dimensional settings, we prove that the nonparanormal skeptic achieves the optimal parametric rate of convergence in both graph and parameter estimation. This striking result suggests that Gaussian copula graphical models can be used as a safe replacement for the popular Gaussian graphical models, even when the data are truly Gaussian. Beyond the theoretical analysis, we also conduct thorough numerical simulations comparing the graph recovery performance of different estimators under both ideal and noisy settings. The proposed methods are then applied to a large-scale genomic dataset to illustrate their empirical usefulness. The R software package huge implementing the proposed methods is available on the Comprehensive R Archive Network: http://cran.r-project.org/.
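
The rank-based idea can be seen in one line: for a Gaussian copula, the latent Pearson correlation is recovered from Kendall's tau as sin(pi/2 * tau), which is invariant to monotone transforms of the margins. A small sketch (variable names and data are ours, for illustration):

```python
import numpy as np
from scipy.stats import kendalltau

def skeptic_correlation(x, y):
    """Rank-based estimate of the latent Gaussian correlation: sin(pi/2 * tau)."""
    tau, _ = kendalltau(x, y)
    return np.sin(np.pi / 2 * tau)

rng = np.random.default_rng(2)
z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=2000)
x, y = np.exp(z[:, 0]), z[:, 1] ** 3   # monotone transforms hide the correlation
rho_hat = skeptic_correlation(x, y)    # close to the latent 0.6
```

Assembling these pairwise estimates into a correlation matrix and handing it to a Gaussian graphical model estimator is the essence of the skeptic procedure.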

* 34 pages, 10 figures; the Annals of Statistics, 2012 

Variational Chernoff Bounds for Graphical Models

Jul 11, 2012
Pradeep Ravikumar, John Lafferty

Recent research has made significant progress on the problem of bounding log partition functions for exponential family graphical models. Such bounds have associated dual parameters that are often used as heuristic estimates of the marginal probabilities required in inference and learning. However, these variational estimates do not give rigorous bounds on marginal probabilities, nor do they give estimates for probabilities of more general events than simple marginals. In this paper we build on this recent work by deriving rigorous upper and lower bounds on event probabilities for graphical models. Our approach is based on the use of generalized Chernoff bounds to express bounds on event probabilities in terms of convex optimization problems; these optimization problems, in turn, require estimates of generalized log partition functions. Simulations indicate that this technique can result in useful, rigorous bounds to complement the heuristic variational estimates, with comparable computational cost.
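
For a scalar random variable, the recipe is the classical Chernoff bound: P(X >= t) <= inf over lambda > 0 of exp(-lambda t) E[exp(lambda X)], with the infimum found by convex optimization (here, a simple grid search). A minimal sketch for a Binomial tail, as a one-dimensional analogue of the paper's graphical-model construction:

```python
import numpy as np

def chernoff_bound(t, n, p):
    """Upper bound on P(Binomial(n, p) >= t) via inf_lambda exp(-lambda*t) * MGF(lambda)."""
    lambdas = np.linspace(1e-3, 10, 2000)              # grid over the dual parameter
    log_mgf = n * np.log(1 - p + p * np.exp(lambdas))  # log MGF of a Binomial
    return np.exp(np.min(-lambdas * t + log_mgf))      # tightest bound on the grid

bound = chernoff_bound(t=70, n=100, p=0.5)
```

For graphical models the expectation inside the bound is intractable, which is why the paper replaces the exact log MGF with variational estimates of generalized log partition functions.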

* Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004) 

The Nonparanormal SKEPTIC

Jun 27, 2012
Han Liu, Fang Han, Ming Yuan, John Lafferty, Larry Wasserman

We propose a semiparametric approach, named the nonparanormal skeptic, for estimating high dimensional undirected graphical models. In terms of modeling, we consider the nonparanormal family proposed by Liu et al. (2009). In terms of estimation, we exploit nonparametric rank-based correlation coefficient estimators, including Spearman's rho and Kendall's tau. In high dimensional settings, we prove that the nonparanormal skeptic achieves the optimal parametric rate of convergence in both graph and parameter estimation. This result suggests that nonparanormal graphical models are a safe replacement for Gaussian graphical models, even when the data are Gaussian.
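
For Spearman's rho, the corresponding map back to the latent Gaussian correlation is 2*sin(pi/6 * rho). A sketch (illustrative only; names and data are ours):

```python
import numpy as np
from scipy.stats import spearmanr

def skeptic_spearman(x, y):
    """Latent Gaussian correlation from Spearman's rho: 2 * sin(pi/6 * rho)."""
    rho, _ = spearmanr(x, y)
    return 2.0 * np.sin(np.pi / 6.0 * rho)

rng = np.random.default_rng(4)
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=3000)
x, y = z[:, 0] ** 3, np.exp(z[:, 1])    # monotone transforms of the latent Gaussians
est = skeptic_spearman(x, y)            # close to the latent 0.5
```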

* Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012) 

Sequential Nonparametric Regression

Jun 27, 2012
Haijie Gu, John Lafferty

We present algorithms for nonparametric regression in settings where the data are obtained sequentially. While traditional estimators select bandwidths that depend upon the sample size, for sequential data the effective sample size is dynamically changing. We propose a linear time algorithm that adjusts the bandwidth for each new data point, and show that the estimator achieves the optimal minimax rate of convergence. We also propose the use of online expert mixing algorithms to adapt to unknown smoothness of the regression function. We provide simulations that confirm the theoretical results, and demonstrate the effectiveness of the methods.
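
The bandwidth schedule can be illustrated with a naive Nadaraya-Watson estimator refit as each point arrives, using h_n proportional to n^(-1/5), the minimax rate for twice-differentiable regression functions. Note this sketch recomputes from scratch each step (quadratic overall); the paper's contribution includes a linear time update and adaptation to unknown smoothness, neither of which is attempted here. All names are ours:

```python
import numpy as np

def sequential_nw(xs, ys, grid, c=0.2):
    """Nadaraya-Watson regression refit as each point arrives, with
    bandwidth h_n = c * n^(-1/5) shrinking as the sample grows."""
    for n in range(1, len(xs) + 1):
        h = c * n ** (-0.2)                              # shrink with sample size
        w = np.exp(-0.5 * ((grid[:, None] - xs[:n]) / h) ** 2)
        fhat = (w @ ys[:n]) / np.maximum(w.sum(axis=1), 1e-12)
    return fhat

rng = np.random.default_rng(3)
xs = rng.uniform(0, 1, 500)
ys = np.sin(2 * np.pi * xs) + 0.1 * rng.standard_normal(500)
grid = np.linspace(0.1, 0.9, 9)
fhat = sequential_nw(xs, ys, grid)
```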

* Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012) 

Conditional Sparse Coding and Grouped Multivariate Regression

Jun 27, 2012
Min Xu, John Lafferty

We study the problem of multivariate regression where the data are naturally grouped, and a regression matrix is to be estimated for each group. We propose an approach in which a dictionary of low rank parameter matrices is estimated across groups, and a sparse linear combination of the dictionary elements is estimated to form a model within each group. We refer to the method as conditional sparse coding since it is a coding procedure for the response vectors Y conditioned on the covariate vectors X. This approach captures the shared information across the groups while adapting to the structure within each group. It exploits the same intuition behind sparse coding that has been successfully developed in computer vision and computational neuroscience. We propose an algorithm for conditional sparse coding, analyze its theoretical properties in terms of predictive accuracy, and present the results of simulation and brain imaging experiments that compare the new technique to reduced rank regression.
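
Once the dictionary is fixed, coding a single group reduces to a lasso in the combination weights w: minimize (1/2)||Y - (sum_k w_k D_k) X||_F^2 + lam * ||w||_1. A minimal ISTA sketch of that inner step (all names are ours; the dictionary learning step itself is not shown):

```python
import numpy as np

def code_group(Y, X, dictionary, lam=0.1, steps=500):
    """ISTA for the per-group lasso over dictionary weights w."""
    F = np.stack([D @ X for D in dictionary])      # features D_k X, shape (K, q, n)
    G = np.einsum('kqn,lqn->kl', F, F)             # Gram matrix of the features
    b = np.einsum('kqn,qn->k', F, Y)
    step = 1.0 / np.linalg.eigvalsh(G).max()       # step size 1/L for the quadratic
    w = np.zeros(len(dictionary))
    for _ in range(steps):
        w = w - step * (G @ w - b)                 # gradient step
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)  # soft-threshold
    return w

rng = np.random.default_rng(5)
dictionary = [rng.standard_normal((4, 3)) for _ in range(3)]
X = rng.standard_normal((3, 50))
B = 1.5 * dictionary[0] - 2.0 * dictionary[2]      # true sparse combination
Y = B @ X + 0.01 * rng.standard_normal((4, 50))
w = code_group(Y, X, dictionary)
```

The sparsity in w is what lets each group borrow a few shared low-rank components while ignoring the rest of the dictionary.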

* Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012) 

Sparse Additive Functional and Kernel CCA

Jun 18, 2012
Sivaraman Balakrishnan, Kriti Puniyani, John Lafferty

Canonical Correlation Analysis (CCA) is a classical tool for finding correlations among the components of two random vectors. In recent years, CCA has been widely applied to the analysis of genomic data, where it is common for researchers to perform multiple assays on a single set of patient samples. Recent work has proposed sparse variants of CCA to address the high dimensionality of such data. However, classical and sparse CCA are based on linear models, and are thus limited in their ability to find general correlations. In this paper, we present two approaches to high-dimensional nonparametric CCA, building on recent developments in high-dimensional nonparametric regression. We present estimation procedures for both approaches, and analyze their theoretical properties in the high-dimensional setting. We demonstrate the effectiveness of these procedures in discovering nonlinear correlations via extensive simulations, as well as through experiments with genomic data.
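
Classical linear CCA, the starting point for both the sparse and nonparametric variants, computes the SVD of the whitened cross-covariance; the leading singular value is the first canonical correlation. A numpy sketch (the ridge term `reg` is our addition for numerical stability; names are ours):

```python
import numpy as np

def first_canonical_correlation(X, Y, reg=1e-6):
    """Leading canonical correlation via SVD of the whitened cross-covariance."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    A = np.linalg.cholesky(np.linalg.inv(Cxx))     # A @ A.T = Cxx^{-1}
    B = np.linalg.cholesky(np.linalg.inv(Cyy))
    return np.linalg.svd(A.T @ Cxy @ B, compute_uv=False)[0]

rng = np.random.default_rng(6)
z = rng.standard_normal(1000)                      # shared latent signal
X = np.column_stack([z + 0.1 * rng.standard_normal(1000), rng.standard_normal(1000)])
Y = np.column_stack([z + 0.1 * rng.standard_normal(1000), rng.standard_normal(1000)])
rho1 = first_canonical_correlation(X, Y)
```

The paper's nonparametric versions replace the linear projections a'X and b'Y with additive or kernel-based functions, but this whitened-SVD view is the common skeleton.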

* Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012) 

Forest Density Estimation

Oct 20, 2010
Han Liu, Min Xu, Haijie Gu, Anupam Gupta, John Lafferty, Larry Wasserman

We study graph estimation and density estimation in high dimensions, using a family of density estimators based on forest structured undirected graphical models. For density estimation, we do not assume the true distribution corresponds to a forest; rather, we form kernel density estimates of the bivariate and univariate marginals, and apply Kruskal's algorithm to estimate the optimal forest on held out data. We prove an oracle inequality on the excess risk of the resulting estimator relative to the risk of the best forest. For graph estimation, we consider the problem of estimating forests with restricted tree sizes. We prove that finding a maximum weight spanning forest with restricted tree size is NP-hard, and develop an approximation algorithm for this problem. Viewing the tree size as a complexity parameter, we then select a forest using data splitting, and prove bounds on excess risk and structure selection consistency of the procedure. Experiments with simulated data and microarray data indicate that the methods are a practical alternative to Gaussian graphical models.
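
The forest step is plain Kruskal: sort candidate edges by estimated weight (in the paper, held-out mutual information between pairs of variables) and greedily add any edge that joins two different components. A self-contained sketch with union-find (all names ours; the example weight matrix is illustrative):

```python
import numpy as np

def max_weight_forest(W, threshold=0.0):
    """Kruskal's algorithm for a maximum weight spanning forest.
    Edges with weight <= threshold are never added, so the result may
    be a forest rather than a single spanning tree."""
    p = W.shape[0]
    parent = list(range(p))
    def find(i):                                   # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    edges = sorted(((W[i, j], i, j) for i in range(p) for j in range(i + 1, p)),
                   reverse=True)                   # heaviest edges first
    forest = []
    for w, i, j in edges:
        if w <= threshold:
            break
        ri, rj = find(i), find(j)
        if ri != rj:                               # adding (i, j) creates no cycle
            parent[ri] = rj
            forest.append((i, j))
    return forest

W = np.array([[0.0, 3.0, 0.5, 0.1],
              [3.0, 0.0, 2.0, 0.2],
              [0.5, 2.0, 0.0, 1.0],
              [0.1, 0.2, 1.0, 0.0]])
forest = max_weight_forest(W)
```

The paper's restricted-tree-size variant is harder (NP-hard, hence the approximation algorithm); the unrestricted greedy step above is the tractable core.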

* Extended version of earlier paper titled "Tree density estimation" 