We study the classic cross approximation of matrices based on maximal volume submatrices. Our main results are an improvement of a classic estimate for matrix cross approximation and a greedy approach for finding maximal volume submatrices. Specifically, we present a new proof of the classic estimate with an improved constant. We also present a family of greedy maximal volume algorithms that improve the error bound of cross approximation of a matrix in the Chebyshev norm and improve the computational efficiency of the classic maximal volume algorithm. The proposed algorithms are shown to have theoretical guarantees of convergence. Finally, we present two applications: image compression and least squares approximation of continuous functions. Our numerical results at the end of the paper demonstrate the effectiveness of our approach.
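To make the notion of cross approximation concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a simple greedy rule (pick the largest residual entry as the next pivot, i.e. Gaussian elimination with full pivoting) and builds the approximation $A \approx A_{:,J} A_{I,J}^{-1} A_{I,:}$ from the selected rows $I$ and columns $J$. The function name `greedy_cross` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def greedy_cross(A, r):
    """Illustrative greedy cross approximation (an assumption, not the paper's method).

    At each step the entry of largest absolute value in the residual is taken
    as the pivot, and a rank-one update removes its row and column from the
    residual. After k steps, A - residual equals A[:, J] @ inv(A[I, J]) @ A[I, :].
    """
    A = np.asarray(A, dtype=float)
    residual = A.copy()
    rows, cols = [], []
    for _ in range(r):
        # Locate the largest residual entry (Chebyshev-norm maximizer).
        i, j = np.unravel_index(np.argmax(np.abs(residual)), residual.shape)
        pivot = residual[i, j]
        if pivot == 0.0:
            break  # residual vanished: A already has rank len(rows)
        rows.append(int(i))
        cols.append(int(j))
        # Rank-one Schur complement update of the residual.
        residual = residual - np.outer(residual[:, j], residual[i, :]) / pivot
    return rows, cols, A - residual

# Usage: a random matrix of rank 3 is recovered from 3 rows and 3 columns.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 6))
rows, cols, approx = greedy_cross(A, 3)
cross = A[:, cols] @ np.linalg.solve(A[np.ix_(rows, cols)], A[rows, :])
```

Here `approx` and `cross` agree up to rounding, which shows that the greedy elimination and the explicit cross formula describe the same approximation. The Chebyshev-norm error is `np.max(np.abs(A - approx))`.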
The local clustering problem aims to extract a small local structure inside a graph without requiring knowledge of the entire graph structure. As the local structure is usually small compared to the entire graph, one can view it as a compressive sensing problem in which the indices of the target cluster form a sparse solution to a linear system. In this paper, we propose a new semi-supervised local cluster extraction approach that applies the idea of compressive sensing, building on two pioneering works under the same framework. Our approach improves on the existing works by taking the initial cut to be the entire graph, and hence overcomes a major limitation of those works, namely the low quality of the initial cut. Extensive experimental results on multiple benchmark datasets demonstrate the effectiveness of our approach.
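The sparse-solution viewpoint can be illustrated on an idealized toy graph. The sketch below is an assumption-laden example rather than the method of the paper: it uses the unnormalized graph Laplacian $L = D - A$ and a graph whose clusters are fully disconnected, in which case the indicator vector of a cluster is a sparse solution of $Lx = 0$. The helper name `cluster_indicator_residual` is hypothetical.

```python
import numpy as np

def cluster_indicator_residual(adjacency, cluster):
    """Return L @ x, where L = D - A and x is the indicator vector of `cluster`.

    For a cluster with no edges to the rest of the graph the result is the
    zero vector, so x is a sparse solution of the linear system L x = 0.
    """
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A  # unnormalized graph Laplacian
    x = np.zeros(A.shape[0])
    x[list(cluster)] = 1.0
    return L @ x

# Toy graph (illustrative only): two disconnected triangles {0,1,2} and {3,4,5}.
A = np.zeros((6, 6))
for block in ([0, 1, 2], [3, 4, 5]):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0

exact = cluster_indicator_residual(A, [0, 1, 2])  # a true cluster
wrong = cluster_indicator_residual(A, [0, 1, 3])  # a set that cuts edges
```

In this idealized setting `exact` is the zero vector while `wrong` is not. On real graphs, clusters are only weakly connected to their complement, so the system holds approximately and the indicator must be recovered by a sparse recovery or least squares procedure.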
A least squares semi-supervised local clustering algorithm based on the idea of compressed sensing is proposed to extract clusters from a graph with a known adjacency matrix. The algorithm is based on a two-stage approach similar to the one in \cite{LaiMckenzie2020}. However, under a weaker assumption and with lower computational complexity than the one in \cite{LaiMckenzie2020}, the algorithm is shown to find a desired cluster with high probability. Several numerical experiments on synthetic data and on real data, such as the MNIST, AT\&T, and YaleB human faces data sets, are conducted to demonstrate the performance of our algorithm.