Ziqiang Shi

Empirical study of PROXTONE and PROXTONE$^+$ for Fast Learning of Large Scale Sparse Models

Apr 18, 2016

Ziqiang Shi, Rujie Liu

Abstract: PROXTONE is a novel and fast method for optimizing large-scale non-smooth convex problems \cite{shi2015large}. In this work, we apply PROXTONE to large-scale \emph{non-smooth non-convex} problems, for example training a sparse deep neural network (sparse DNN) or a sparse convolutional neural network (sparse CNN) for embedded or mobile devices. PROXTONE converges much faster than first-order methods, whereas first-order methods make it easier to derive and control the sparsity of the solutions. Thus, in applications where sparse models must be trained quickly, we propose to combine the merits of both: we use PROXTONE in the first several epochs to reach the neighborhood of an optimal solution, and then use a first-order method to explore the possibility of sparsity in the remaining training. We call this method PROXTONE plus (PROXTONE$^+$). Both PROXTONE and PROXTONE$^+$ are tested in our experiments, which demonstrate that both methods converge at least twice as fast on diverse sparse model learning problems while reducing the size of DNN models to 0.5\%. The source code of all the algorithms is available upon request.
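The abstract does not say which first-order method is used in the second phase. As a rough illustration of why first-order proximal methods make sparsity easy to control, the sketch below runs plain proximal gradient descent (ISTA) on an $\ell_1$-regularized least-squares problem. It is not PROXTONE or PROXTONE$^+$; the problem, the function names, and the parameter `lam` are assumptions made for this example.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_objective(A, b, x, lam):
    """0.5 * ||A x - b||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def proximal_gradient_lasso(A, b, lam, n_iter=500):
    """Minimize the lasso objective with proximal gradient steps (ISTA)."""
    # Lipschitz constant of the gradient of the smooth least-squares term.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)
        # Gradient step on the smooth part, then the l1 proximal step.
        # The thresholding is what sets small weights exactly to zero.
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Hypothetical toy data: 40 measurements of a 100-dimensional sparse vector.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[:5] = [3.0, -2.0, 4.0, 1.5, -3.5]
b = A @ x_true

x_hat = proximal_gradient_lasso(A, b, lam=1.0)
print("nonzero weights:", np.count_nonzero(x_hat), "of", x_hat.size)
```

Raising `lam` widens the thresholding band, so more weights are set exactly to zero. That is the sense in which sparsity is directly controllable in this kind of method.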

* arXiv admin note: text overlap with arXiv:1311.2115 by other authors

Guarantees of Augmented Trace Norm Models in Tensor Recovery

Jul 23, 2012

Ziqiang Shi, Jiqing Han, Tieran Zheng, Shiwen Deng, Ji Li

Abstract: This paper studies the recovery guarantees of models that minimize $\|\mathcal{X}\|_*+\frac{1}{2\alpha}\|\mathcal{X}\|_F^2$, where $\mathcal{X}$ is a tensor and $\|\mathcal{X}\|_*$ and $\|\mathcal{X}\|_F$ are the trace norm and Frobenius norm of $\mathcal{X}$, respectively. We show that they can efficiently recover low-rank tensors. In particular, they enjoy exact recovery guarantees similar to those known for minimizing $\|\mathcal{X}\|_*$ under conditions on the sensing operator such as its null-space property, restricted isometry property, or spherical section property. To recover a low-rank tensor $\mathcal{X}^0$, minimizing $\|\mathcal{X}\|_*+\frac{1}{2\alpha}\|\mathcal{X}\|_F^2$ returns the same solution as minimizing $\|\mathcal{X}\|_*$ almost whenever $\alpha\geq 10\max_{i}\|X^0_{(i)}\|_2$.
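The threshold on $\alpha$ can be evaluated numerically for a given tensor. The sketch below assumes that $X_{(i)}$ denotes the mode-$i$ unfolding of the tensor and that $\|\cdot\|_2$ is the spectral norm (largest singular value); the abstract does not define either notation, and the function names are made up for this example.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows index the chosen mode, columns the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def alpha_lower_bound(tensor):
    """10 * max_i ||X_(i)||_2, using the spectral norm of each unfolding."""
    spectral_norms = [
        np.linalg.norm(unfold(tensor, mode), 2) for mode in range(tensor.ndim)
    ]
    return 10.0 * max(spectral_norms)

# Hypothetical rank-one tensor 3 * (u outer v outer w) with unit vectors.
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
w = np.array([0.6, 0.8, 0.0, 0.0])
X0 = 3.0 * np.einsum("i,j,k->ijk", u, v, w)

print("alpha should be at least:", alpha_lower_bound(X0))
```

For this rank-one example every unfolding has a single nonzero singular value equal to 3, so the bound evaluates to 30.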

Online Learning for Classification of Low-rank Representation Features and Its Applications in Audio Segment Classification

Dec 19, 2011

Ziqiang Shi, Jiqing Han, Tieran Zheng, Shiwen Deng

Abstract: In this paper, a novel framework based on trace norm minimization for audio segment classification is proposed. In this framework, both feature extraction and classification are obtained by solving corresponding convex optimization problems with trace norm regularization. For feature extraction, robust principal component analysis (robust PCA), which minimizes a combination of the nuclear norm and the $\ell_1$-norm, is used to extract low-rank features of audio segments that are robust to white noise and gross corruption. These low-rank features are fed to a linear classifier whose weight and bias are learned by solving similar trace norm constrained problems. Most existing methods find the weight and bias of this classifier by batch-mode learning, which makes them inefficient for large-scale problems. In this paper, we propose an online framework using the accelerated proximal gradient method. The main advantage of this framework is its low memory cost. In addition, because of the regularization formulation of matrix classification, the Lipschitz constant is given explicitly, so the step size estimation of the general proximal gradient method is omitted in our approach. Experiments on real data sets for laugh/non-laugh and applause/non-applause classification indicate that this novel framework is effective and noise robust.
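To illustrate how an explicit Lipschitz constant removes the need for step size estimation, here is a minimal batch sketch of accelerated proximal gradient (FISTA-style) on a trace norm regularized least-squares problem. It is a stand-in, not the paper's online classifier: the loss, the synthetic data, and all function names are assumptions made for this example.

```python
import numpy as np

def svt(M, tau):
    """Proximal operator of tau * ||.||_*: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(X, Y, W, lam):
    """0.5 * ||X W - Y||_F^2 + lam * ||W||_* (nuclear norm)."""
    return 0.5 * np.linalg.norm(X @ W - Y, "fro") ** 2 + lam * np.linalg.norm(W, "nuc")

def accelerated_proximal_gradient(X, Y, lam, n_iter=200):
    """Minimize the objective above with a fixed step 1/L and momentum."""
    # L is known in closed form here, so no backtracking line search is needed.
    L = np.linalg.norm(X, 2) ** 2
    W = np.zeros((X.shape[1], Y.shape[1]))
    Z, t = W.copy(), 1.0
    for _ in range(n_iter):
        grad = X.T @ (X @ Z - Y)
        W_next = svt(Z - grad / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Extrapolate from the last two iterates (momentum step).
        Z = W_next + ((t - 1.0) / t_next) * (W_next - W)
        W, t = W_next, t_next
    return W

# Hypothetical synthetic data with a rank-2 ground-truth weight matrix.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
W_true = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 10))
Y = X @ W_true + 0.01 * rng.standard_normal((50, 10))

W_hat = accelerated_proximal_gradient(X, Y, lam=1.0)
print("objective at W_hat:", objective(X, Y, W_hat, 1.0))
```

The only problem-dependent quantity in the loop is `L`, computed once from `X`. When no such closed form is available, a general proximal gradient method has to estimate the step size at each iteration instead.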
