Ming Yuan

Multiple Testing of Linear Forms for Noisy Matrix Completion

Dec 01, 2023
Wanteng Ma, Lilun Du, Dong Xia, Ming Yuan

Many important tasks of large-scale recommender systems can be naturally cast as testing multiple linear forms for noisy matrix completion. These problems, however, present unique challenges because of the subtle bias-and-variance tradeoff and the intricate dependence among the estimated entries induced by the low-rank structure. In this paper, we develop a general approach to overcome these difficulties by introducing new statistics for individual tests with sharp asymptotics both marginally and jointly, and utilizing them to control the false discovery rate (FDR) via a data splitting and symmetric aggregation scheme. We show that valid FDR control can be achieved with guaranteed power under nearly optimal sample size requirements using the proposed methodology. Extensive numerical simulations and real data examples are also presented to further illustrate its practical merits.
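
As a rough illustration of the FDR-control scheme described above, the sketch below shows the generic data-splitting and symmetric-aggregation recipe: statistics computed on two independent halves of the data are multiplied into signed scores that are symmetric about zero under the null, and the rejection threshold is the smallest one keeping the estimated false discovery proportion below the target level. The function name and inputs are hypothetical, and the paper's debiased linear-form statistics are not reproduced here.

```python
import numpy as np

def symmetric_aggregation_fdr(stat_split1, stat_split2, alpha=0.1):
    """Generic sketch of FDR control via data splitting and symmetric aggregation.

    stat_split1, stat_split2: per-hypothesis statistics computed on two
    independent data splits, centered at zero under the null (an assumption
    of this sketch, not a quote of the paper's statistics).
    """
    W = stat_split1 * stat_split2               # signed scores, symmetric about 0 under the null
    for t in np.sort(np.abs(W[W != 0])):        # smallest threshold with estimated FDP <= alpha
        fdp_hat = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
        if fdp_hat <= alpha:
            return np.flatnonzero(W >= t)       # indices of rejected linear forms
    return np.array([], dtype=int)              # no rejections at the target level
```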

Mode-wise Principal Subspace Pursuit and Matrix Spiked Covariance Model

Jul 02, 2023
Runshi Tang, Ming Yuan, Anru R. Zhang

This paper introduces a novel framework called Mode-wise Principal Subspace Pursuit (MOP-UP) to extract hidden variations in both the row and column dimensions of matrix data. To motivate the framework, we introduce a class of matrix-variate spiked covariance models that inspire the development of the MOP-UP algorithm. The MOP-UP algorithm consists of two steps, Average Subspace Capture (ASC) and Alternating Projection (AP), designed to capture the row-wise and column-wise dimension-reduced subspaces that contain the most informative features of the data. ASC uses a novel average projection operator as initialization and achieves exact recovery in the noiseless setting. We analyze the convergence and non-asymptotic error bounds of MOP-UP, introducing a blockwise matrix eigenvalue perturbation bound that delivers the desired guarantee where classical perturbation bounds fail. The effectiveness and practical merits of the proposed framework are demonstrated through experiments on both simulated and real datasets. Lastly, we discuss generalizations of our approach to higher-order data.
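
To make the two-step structure concrete, here is a minimal numpy sketch of an ASC-style initialization: the row-wise and column-wise subspaces are read off from averaged second-moment matrices of the matrix-valued samples. The function and argument names are illustrative only, the paper's exact average projection operator is not reproduced, and the Alternating Projection refinement, which would iteratively re-estimate each subspace given the other, is omitted.

```python
import numpy as np

def average_subspace_capture(X, r1, r2):
    """Illustrative ASC-style initialization (not the paper's exact operator).

    X: array of shape (n, p1, p2) holding n matrix-valued samples.
    Returns orthonormal bases for the estimated row-wise (p1 x r1) and
    column-wise (p2 x r2) principal subspaces.
    """
    row_moment = np.mean([Xi @ Xi.T for Xi in X], axis=0)   # averaged p1 x p1 second moment
    col_moment = np.mean([Xi.T @ Xi for Xi in X], axis=0)   # averaged p2 x p2 second moment
    U = np.linalg.eigh(row_moment)[1][:, -r1:]              # leading eigenvectors: row subspace
    V = np.linalg.eigh(col_moment)[1][:, -r2:]              # leading eigenvectors: column subspace
    return U, V
```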

Large Dimensional Independent Component Analysis: Statistical Optimality and Computational Tractability

Mar 31, 2023
Arnab Auddy, Ming Yuan

In this paper, we investigate the optimal statistical performance and the impact of computational constraints for independent component analysis (ICA). Our goal is twofold. On the one hand, we characterize the precise role of dimensionality in sample complexity and statistical accuracy, and how computational considerations may affect them. In particular, we show that the optimal sample complexity is linear in dimensionality and, interestingly, that the commonly used sample kurtosis-based approaches are necessarily suboptimal. However, the optimal sample complexity becomes quadratic in the dimension, up to a logarithmic factor, if we restrict ourselves to estimates that can be computed with low-degree polynomial algorithms. On the other hand, we develop computationally tractable estimates that attain both the optimal sample complexity and the minimax optimal rates of convergence. We study the asymptotic properties of the proposed estimates and establish their asymptotic normality, which can be readily used for statistical inference. Our method is fairly easy to implement, and numerical experiments are presented to further demonstrate its practical merits.
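
For concreteness, the snippet below sketches the classical kurtosis-based approach referenced above: a one-unit, FastICA-style fixed-point iteration that maximizes the sample kurtosis of a projection after whitening. It is included as a baseline illustration (the kind of estimate the paper shows to be suboptimal in high dimensions), not as the paper's proposed estimator; the helper name and defaults are hypothetical.

```python
import numpy as np

def kurtosis_ica_direction(X, n_iter=200, seed=0):
    """One-unit, FastICA-style fixed-point iteration maximizing sample kurtosis.

    X: centered data of shape (n, d) with n > d and a well-conditioned
    sample covariance (assumptions of this sketch).
    """
    n, d = X.shape
    evals, evecs = np.linalg.eigh(X.T @ X / n)
    Z = (X @ evecs) / np.sqrt(evals)            # whitened data: identity sample covariance
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        s = Z @ w
        w = (Z.T @ s**3) / n - 3.0 * w          # fixed-point update for the kurtosis contrast
        w /= np.linalg.norm(w)
    return w                                    # estimated unmixing direction in whitened coordinates
```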

On Recovering the Best Rank-r Approximation from Few Entries

Nov 11, 2021
Shun Xu, Ming Yuan

In this note, we investigate how well we can reconstruct the best rank-$r$ approximation of a large matrix from a small number of its entries. We show that even if a data matrix is of full rank and cannot be approximated well by a low-rank matrix, its best low-rank approximations may still be reliably computed or estimated from a small number of its entries. This is especially relevant from a statistical viewpoint: the best low-rank approximations to a data matrix are often of more interest than the matrix itself because they capture the more stable, and oftentimes more reproducible, properties of an otherwise complicated data-generating model. In particular, we investigate two agnostic approaches: the first is based on spectral truncation, and the second is a projected gradient descent based optimization procedure. We argue that, while the first approach is intuitive and reasonably effective, the latter has far superior performance in general. We show that the error depends on how close the matrix is to being of low rank. Both theoretical and numerical evidence is presented to demonstrate the effectiveness of the proposed approaches.
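
The two agnostic approaches are simple enough to sketch. Below, spectral truncation rescales the zero-filled observations by the inverse sampling rate and keeps the top-$r$ SVD factors, and the projected gradient descent variant refines that initialization by alternating a gradient step on the observed entries with truncation back to rank $r$. This is a generic sketch under an assumed uniform-sampling mask, not the paper's exact procedure.

```python
import numpy as np

def spectral_truncation(Y, mask, r):
    """Rank-r truncation of the zero-filled, inverse-propensity-rescaled observations."""
    p_hat = mask.mean()                                       # observed fraction (uniform sampling assumed)
    U, s, Vt = np.linalg.svd(np.where(mask, Y, 0.0) / p_hat, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def projected_gradient_descent(Y, mask, r, n_iter=100):
    """Unit gradient step on the squared loss over observed entries, then rank-r truncation."""
    M = spectral_truncation(Y, mask, r)                       # spectral initialization
    for _ in range(n_iter):
        M = M - np.where(mask, M - Y, 0.0)                    # gradient step: correct observed entries
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        M = (U[:, :r] * s[:r]) @ Vt[:r]                       # project back onto rank-r matrices
    return M
```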

On Estimating Rank-One Spiked Tensors in the Presence of Heavy Tailed Errors

Jul 20, 2021
Arnab Auddy, Ming Yuan

In this paper, we study the estimation of a rank-one spiked tensor in the presence of heavy-tailed noise. Our results highlight some of the fundamental similarities and differences in the tradeoff between statistical and computational efficiencies under heavy-tailed and Gaussian noise. In particular, we show that, for $p$th-order tensors, the tradeoff manifests in an identical fashion to the Gaussian case when the noise has a finite $4(p-1)$th moment. The difference in the signal strength required, with or without computational constraints, to estimate the singular vectors at the optimal rate interestingly narrows for noise with heavier tails and vanishes when the noise has only a finite fourth moment. Moreover, if the noise does not have a finite fourth moment, tensor SVD, perhaps the most natural approach, is suboptimal even though it is computationally intractable. Our analysis exploits a close connection between estimating the rank-one spikes and the spectral norm of a random tensor with i.i.d. entries. In particular, we show that the order of the spectral norm of a random tensor can be precisely characterized by the moments of its entries, generalizing classical results for random matrices. In addition to the theoretical guarantees, we propose estimation procedures for the heavy-tailed regime, which are easy to implement and efficient to run. Numerical experiments are presented to demonstrate their practical merits.
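
As a concrete illustration of the kind of procedure involved, the sketch below runs a rank-one tensor power iteration on an order-3 tensor, optionally after entrywise truncation as a simple guard against heavy tails. The truncation level and function signature are illustrative assumptions; the estimators proposed in the paper may differ in detail.

```python
import numpy as np

def rank_one_power_iteration(T, n_iter=100, tau=None, seed=0):
    """Rank-one power iteration for an order-3 tensor, with optional entrywise truncation.

    Truncating entries at level tau is one simple robustification against
    heavy-tailed noise; it is an illustrative choice, not necessarily the
    paper's procedure.
    """
    if tau is not None:
        T = np.clip(T, -tau, tau)
    rng = np.random.default_rng(seed)
    u, v, w = (rng.standard_normal(d) for d in T.shape)
    u, v, w = u / np.linalg.norm(u), v / np.linalg.norm(v), w / np.linalg.norm(w)
    for _ in range(n_iter):
        u = np.einsum('ijk,j,k->i', T, v, w); u /= np.linalg.norm(u)
        v = np.einsum('ijk,i,k->j', T, u, w); v /= np.linalg.norm(v)
        w = np.einsum('ijk,i,j->k', T, u, v); w /= np.linalg.norm(w)
    lam = np.einsum('ijk,i,j,k->', T, u, v, w)   # estimated signal strength
    return lam, u, v, w
```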

* 46 pages, 7 figures 

Assessing Fairness in Classification Parity of Machine Learning Models in Healthcare

Feb 07, 2021
Ming Yuan, Vikas Kumar, Muhammad Aurangzeb Ahmad, Ankur Teredesai

Fairness in AI and machine learning systems has become a fundamental problem in the accountability of AI systems. While the need for accountability of AI models is nearly ubiquitous, healthcare is a particularly challenging field where accountability of such systems takes on additional importance, as decisions in healthcare can have life-altering consequences. In this paper, we present preliminary results on fairness in the context of classification parity in healthcare. We also present some exploratory methods for improving fairness and for choosing appropriate classification algorithms in the context of healthcare.

A Sharp Blockwise Tensor Perturbation Bound for Orthogonal Iteration

Aug 06, 2020
Yuetian Luo, Garvesh Raskutti, Ming Yuan, Anru R. Zhang

In this paper, we develop novel perturbation bounds for the high-order orthogonal iteration (HOOI) [DLDMV00b]. Under mild regularity conditions, we establish blockwise tensor perturbation bounds for HOOI with guarantees for both tensor reconstruction in Hilbert-Schmidt norm $\|\widehat{\mathcal{T}} - \mathcal{T}\|_{\mathrm{HS}}$ and mode-$k$ singular subspace estimation in Schatten-$q$ norm $\|\sin\Theta(\widehat{\mathbf{U}}_k, \mathbf{U}_k)\|_q$ for any $q \geq 1$. We show that the upper bounds for mode-$k$ singular subspace estimation are unilateral and converge linearly to a quantity characterized by the blockwise errors of the perturbation and the signal strength. For the tensor reconstruction error bound, we express the bound through a simple quantity $\xi$ that depends only on the perturbation and the multilinear rank of the underlying signal. A rate-matching deterministic lower bound for tensor reconstruction, which demonstrates the optimality of HOOI, is also provided. Furthermore, we prove that one-step HOOI (i.e., HOOI with only a single iteration) is also optimal in terms of tensor reconstruction and can be used to lower the computational cost. The perturbation results are also extended to the case where only some modes of $\mathcal{T}$ have low-rank structure. We support our theoretical results with extensive numerical studies. Finally, we apply the new perturbation bounds for HOOI to two applications from machine learning and statistics, tensor denoising and tensor co-clustering, which demonstrates the superiority of the new perturbation results.
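
For readers unfamiliar with HOOI, the following is a standard textbook sketch of the algorithm for an order-3 tensor: initialize the mode-wise loadings by truncated SVDs of the unfoldings (HOSVD), then alternately refine each mode given the other two. It illustrates the object the perturbation bounds are about; it is not code from the paper.

```python
import numpy as np

def hooi(T, ranks, n_iter=20):
    """Higher-order orthogonal iteration for an order-3 tensor T with multilinear ranks (r1, r2, r3)."""
    unfold = lambda A, k: np.moveaxis(A, k, 0).reshape(A.shape[k], -1)   # mode-k matricization
    # HOSVD initialization: leading left singular vectors of each unfolding.
    U = [np.linalg.svd(unfold(T, k), full_matrices=False)[0][:, :ranks[k]] for k in range(3)]
    for _ in range(n_iter):
        for k in range(3):
            a, b = [m for m in range(3) if m != k]
            # Contract the other two modes with their current loadings, then update mode k.
            W = np.einsum('ijk,ja,kb->iab', np.moveaxis(T, k, 0), U[a], U[b])
            U[k] = np.linalg.svd(W.reshape(T.shape[k], -1), full_matrices=False)[0][:, :ranks[k]]
    core = np.einsum('ijk,ia,jb,kc->abc', T, U[0], U[1], U[2])           # core tensor
    return core, U
```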

Perturbation Bounds for Orthogonally Decomposable Tensors and Their Applications in High Dimensional Data Analysis

Jul 17, 2020
Arnab Auddy, Ming Yuan

We develop deterministic perturbation bounds for singular values and vectors of orthogonally decomposable tensors, in a spirit similar to classical results for matrices. Our bounds exhibit intriguing differences between matrices and higher-order tensors. Most notably, they indicate that for higher-order tensors, perturbation affects each singular value/vector in isolation; in particular, its effect on a singular vector does not depend on the multiplicity of the corresponding singular value or its distance from other singular values. Our results can be readily applied and provide a unified treatment of many different problems involving higher-order orthogonally decomposable tensors. In particular, we illustrate the implications of our bounds through three connected yet seemingly different high-dimensional data analysis tasks: tensor SVD, tensor regression, and estimation of latent variable models, leading to new insights in each of these settings.

ISLET: Fast and Optimal Low-rank Tensor Regression via Importance Sketching

Nov 09, 2019
Anru Zhang, Yuetian Luo, Garvesh Raskutti, Ming Yuan

In this paper, we develop a novel procedure for low-rank tensor regression, namely Importance Sketching Low-rank Estimation for Tensors (ISLET). The central idea behind ISLET is importance sketching, i.e., carefully designed sketches based on both the responses and the low-dimensional structure of the parameter of interest. We show that the proposed method is sharply minimax optimal in terms of the mean-squared error under low-rank Tucker assumptions and a randomized Gaussian ensemble design. In addition, if a tensor is low-rank with group sparsity, our procedure also achieves minimax optimality. Further, we show through numerical studies that ISLET achieves mean-squared error performance comparable to or better than existing state-of-the-art methods while having substantial storage and run-time advantages, including capabilities for parallel and distributed computing. In particular, our procedure performs reliable estimation with tensors of dimension $p = O(10^8)$ and is one to two orders of magnitude faster than baseline methods.
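
To convey the importance-sketching idea in the simplest possible setting, the sketch below works with matrix-variate regression under a design with i.i.d. standard Gaussian entries: form the moment estimate $\sum_i y_i A_i / n$, extract its top-$r$ singular subspaces, regress the responses on covariates projected into those subspaces, and assemble a low-rank estimate. The full ISLET procedure additionally sketches the directions orthogonal to the estimated subspaces and handles general tensors; those parts are omitted here, and all names are illustrative.

```python
import numpy as np

def sketched_matrix_regression(A, y, r):
    """Simplified, matrix-variate illustration of importance sketching (not full ISLET).

    A: covariates of shape (n, p1, p2) with i.i.d. standard Gaussian entries
    (assumed so that the moment estimate below is proportional to the parameter);
    y: responses of shape (n,); r: target rank.
    """
    n = len(y)
    M = np.einsum('i,ijk->jk', y, A) / n                          # moment estimate of the parameter matrix
    Usvd, _, Vt = np.linalg.svd(M, full_matrices=False)
    U, V = Usvd[:, :r], Vt[:r].T                                  # estimated row/column subspaces
    B = np.einsum('ja,ijk,kb->iab', U, A, V).reshape(n, r * r)    # sketched covariates
    core, *_ = np.linalg.lstsq(B, y, rcond=None)                  # low-dimensional least squares
    return U @ core.reshape(r, r) @ V.T                           # assembled low-rank estimate
```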

On the Optimality of Gaussian Kernel Based Nonparametric Tests against Smooth Alternatives

Sep 07, 2019
Tong Li, Ming Yuan

Nonparametric tests via kernel embedding of distributions have witnessed a great deal of practical success in recent years. However, the statistical properties of these tests are largely unknown beyond consistency against a fixed alternative. To fill this void, we study the asymptotic properties of goodness-of-fit, homogeneity, and independence tests using Gaussian kernels, arguably the most popular and successful among such tests. Our results provide theoretical justification for this common practice by showing that tests using a Gaussian kernel with an appropriately chosen scaling parameter are minimax optimal against smooth alternatives in all three settings. In addition, our analysis pinpoints the importance of choosing a diverging scaling parameter when using Gaussian kernels and suggests a data-driven choice of the scaling parameter that yields tests optimal, up to an iterated logarithmic factor, over a wide range of smooth alternatives. Numerical experiments are also presented to further demonstrate the practical merits of the methodology.
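
As a small illustration of the homogeneity test discussed above, the snippet below computes the unbiased MMD$^2$ statistic with the Gaussian kernel $k(x, y) = \exp(-\nu \|x - y\|^2)$; the scaling parameter $\nu$ is the quantity whose diverging choice the abstract highlights. Calibration of the test (e.g., by permutation) and the data-driven choice of $\nu$ are omitted, and the function name is illustrative.

```python
import numpy as np

def gaussian_mmd2(X, Y, nu):
    """Unbiased MMD^2 with Gaussian kernel k(x, y) = exp(-nu * ||x - y||^2).

    X: (n, d) sample from P; Y: (m, d) sample from Q; nu: scaling parameter.
    """
    def gram(A, B):
        sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
        return np.exp(-nu * sq)
    n, m = len(X), len(Y)
    Kxx, Kyy, Kxy = gram(X, X), gram(Y, Y), gram(X, Y)
    within_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))   # off-diagonal mean within X
    within_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))   # off-diagonal mean within Y
    return within_x + within_y - 2 * Kxy.mean()
```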
