Nonnegative matrix factorization (NMF) has been shown to be identifiable under the separability assumption, under which all the columns (or rows) of the input data matrix belong to the convex cone generated by only a few of these columns (or rows) [1]. In real applications, however, this separability assumption is hard to satisfy. Following [4] and [5], in this paper we study the Linear Programming (LP) based reformulation for locating the extreme rays of the convex cone, but in a noisy setting. Furthermore, to handle large-scale data, we employ First-Order Methods (FOM) to mitigate the computational complexity of the LP, which results primarily from its large number of constraints. We report the performance of the algorithm on real and synthetic data sets.
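For reference, the separability assumption for columns can be written compactly as follows. The notation is assumed here for illustration and is not taken from the paper: X is the m-by-n nonnegative data matrix, K is the index set of the r generating columns, H is a nonnegative coefficient matrix, and the additive noise matrix N is one common way to model the noisy setting.

```latex
X = X_{:,\mathcal{K}}\, H, \qquad H \in \mathbb{R}_{+}^{r \times n}, \qquad |\mathcal{K}| = r,
\qquad \text{and, with noise,} \qquad \tilde{X} = X + N .
```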
This paper presents a robust algorithm for nonnegative matrix factorization (NMF) designed for large-scale data that satisfy the separability assumption. In particular, we modify the Linear Programming (LP) algorithm of [9] by introducing a reduced set of constraints for exact NMF. In contrast to previous approaches, the proposed algorithm does not require knowledge of the factorization rank (the number of extreme rays [3] or topics [7]). Furthermore, motivated by a similar problem arising in metabolic network analysis [13], we consider an entirely different regime in which the number of extreme rays or topics can be much larger than the dimension of the data vectors. The performance of the algorithm on different synthetic data sets is reported.
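As a rough illustration of what locating extreme rays without knowing the rank can look like, the sketch below tests each column of a small noiseless matrix for membership in the cone generated by the remaining columns. It is not the LP of [9] or the reduced constraint set proposed here. It is a brute-force stand-in that solves one nonnegative least-squares problem per column with projected gradient descent, a simple first-order method. The function names, the tolerance, and the toy matrix are all hypothetical, and the check assumes exact separability with no duplicated or rescaled extreme columns.

```python
import math


def residual(cols, h, x):
    """Return A h - x, where A is given as a list of its columns."""
    return [sum(cols[k][i] * h[k] for k in range(len(cols))) - x[i]
            for i in range(len(x))]


def nnls_projected_gradient(cols, x, iters=2000):
    """Approximately minimize 0.5 * ||A h - x||^2 subject to h >= 0.

    Uses a fixed step 1/L, where L is the squared Frobenius norm of A,
    an upper bound on the largest eigenvalue of A^T A.
    Returns the final residual norm ||A h - x||.
    """
    lip = sum(v * v for col in cols for v in col) or 1.0
    step = 1.0 / lip
    h = [0.0] * len(cols)
    for _ in range(iters):
        r = residual(cols, h, x)
        # Gradient is A^T r; the projection onto h >= 0 is a clamp at zero.
        grad = [sum(col[i] * r[i] for i in range(len(x))) for col in cols]
        h = [max(0.0, h[k] - step * grad[k]) for k in range(len(cols))]
    r = residual(cols, h, x)
    return math.sqrt(sum(v * v for v in r))


def find_extreme_columns(columns, tol=1e-6):
    """Indices of columns that are NOT in the cone of the other columns."""
    extreme = []
    for j, x in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        if nnls_projected_gradient(others, x) > tol:
            extreme.append(j)
    return extreme


# Toy separable data (assumed): three extreme rays and two mixtures of them.
columns = [
    [1.0, 0.0, 0.0],   # extreme ray
    [0.5, 0.5, 0.0],   # mixture of columns 0 and 2
    [0.0, 1.0, 0.0],   # extreme ray
    [0.2, 0.3, 0.5],   # mixture of columns 0, 2 and 4
    [0.0, 0.0, 1.0],   # extreme ray
]
extreme = find_extreme_columns(columns)
print(extreme)  # [0, 2, 4]
```

The number of extreme columns is an output of the check rather than an input, which is the property the abstract emphasizes. The cost of one optimization problem per column is exactly what makes a single LP with fewer constraints attractive at scale.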