Junzhou Huang

Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization

Jun 21, 2018
Jiaxiang Wu, Weidong Huang, Junzhou Huang, Tong Zhang

Large-scale distributed optimization is of great importance in various applications. In data-parallel distributed learning, inter-node gradient communication often becomes the performance bottleneck. In this paper, we propose the error compensated quantized stochastic gradient descent algorithm to improve training efficiency. Local gradients are quantized to reduce the communication overhead, and the accumulated quantization error is used to speed up convergence. Furthermore, we present a theoretical analysis of the convergence behaviour and demonstrate its advantage over competitors. Extensive experiments indicate that our algorithm can compress gradients by up to two orders of magnitude without performance degradation.

* Accepted by ICML 2018
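
The error-compensation idea in this abstract can be illustrated with a minimal sketch. It assumes a deterministic uniform quantizer and a plain (undecayed) error accumulator; the names `quantize` and `compensated_step` are hypothetical, and the paper's actual quantizer and accumulation rule may differ.

```python
def quantize(vec, levels=1):
    """Deterministic uniform quantizer (an assumption, not the paper's exact scheme)."""
    scale = max(abs(v) for v in vec)
    if scale == 0.0:
        return [0.0] * len(vec)
    # Each entry is rounded to one of 2*levels + 1 evenly spaced values in [-scale, scale].
    return [round(v / scale * levels) / levels * scale for v in vec]


def compensated_step(grad, error):
    """Quantize the error-corrected gradient and return (message, new_error)."""
    corrected = [g + e for g, e in zip(grad, error)]
    message = quantize(corrected)
    new_error = [c - m for c, m in zip(corrected, message)]
    return message, new_error


grad = [0.3, 1.0, -0.2]        # toy local gradient, held constant for illustration
error = [0.0, 0.0, 0.0]        # quantization error carried between steps
sent_total = [0.0, 0.0, 0.0]   # what the receiver has accumulated so far
steps = 50
for _ in range(steps):
    message, error = compensated_step(grad, error)
    sent_total = [s + m for s, m in zip(sent_total, message)]

# Whatever has not been transmitted yet is held in `error`, so no gradient mass is lost.
gap = max(abs(s + e - steps * g) for s, e, g in zip(sent_total, error, grad))
print(gap)
```

In this sketch each message takes only a few distinct values per step, yet the transmitted total plus the carried error always equals the sum of the true gradients, up to floating-point rounding.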

Nonparametric Topic Modeling with Neural Inference

Jun 18, 2018
Xuefei Ning, Yin Zheng, Zhuxi Jiang, Yu Wang, Huazhong Yang, Junzhou Huang

This work focuses on combining nonparametric topic models with Auto-Encoding Variational Bayes (AEVB). Specifically, we first propose iTM-VAE, where the topics are treated as trainable parameters and the document-specific topic proportions are obtained by a stick-breaking construction. The inference of iTM-VAE is modeled by neural networks such that it can be computed in a simple feed-forward manner. We also describe how to introduce a hyper-prior into iTM-VAE so as to model the uncertainty of the prior parameter. Actually, the hyper-prior technique is quite general and we show that it can be applied to other AEVB based models to alleviate the collapse-to-prior problem elegantly. Moreover, we also propose HiTM-VAE, where the document-specific topic distributions are generated in a hierarchical manner. HiTM-VAE is even more flexible and can generate topic distributions with better variability. Experimental results on 20News and Reuters RCV1-V2 datasets show that the proposed models outperform the state-of-the-art baselines significantly. The advantages of the hyper-prior technique and the hierarchical model construction are also confirmed by experiments.

* 11 pages, 2 figures
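
The stick-breaking construction mentioned in the abstract can be sketched as follows. This is the generic truncated construction only; how the break fractions are produced (for example by an inference network) is outside the sketch, and the function name `stick_breaking` is hypothetical.

```python
def stick_breaking(fractions):
    """Turn break fractions v_k in [0, 1] into proportions pi_k = v_k * prod_{j<k} (1 - v_j)."""
    proportions = []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for v in fractions:
        proportions.append(v * remaining)
        remaining *= 1.0 - v
    # Truncation: the leftover stick becomes the final component, so the result sums to 1.
    proportions.append(remaining)
    return proportions


print(stick_breaking([0.5, 0.5, 0.5]))  # [0.5, 0.25, 0.125, 0.125]
```

Each fraction takes a share of whatever stick remains, so the number of components with non-negligible mass adapts to the fractions rather than being fixed in advance.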

Adversarial Learning with Local Coordinate Coding

Jun 14, 2018
Jiezhang Cao, Yong Guo, Qingyao Wu, Chunhua Shen, Junzhou Huang, Mingkui Tan

Generative adversarial networks (GANs) aim to generate realistic data from some prior distribution (e.g., Gaussian noise). However, such a prior distribution is often independent of the real data and thus may lose semantic information about the data (e.g., the geometric structure or content of images). In practice, the semantic information might be represented by some latent distribution learned from data, which, however, is hard to use for sampling in GANs. In this paper, rather than sampling from a pre-defined prior distribution, we propose a Local Coordinate Coding (LCC) based sampling method to improve GANs. We derive a generalization bound for LCC based GANs and prove that a low-dimensional input is sufficient to achieve good generalization. Extensive experiments on various real-world datasets demonstrate the effectiveness of the proposed method.

* 14 pages, 7 figures, 4 tables

Adaptive Cost-sensitive Online Classification

Apr 06, 2018
Peilin Zhao, Yifan Zhang, Min Wu, Steven C. H. Hoi, Mingkui Tan, Junzhou Huang

Cost-sensitive online classification has drawn extensive attention in recent years, where the main approach is to directly optimize, in an online manner, two well-known cost-sensitive metrics: (i) the weighted sum of sensitivity and specificity; (ii) the weighted misclassification cost. However, existing methods only consider first-order information of the data stream. This is insufficient in practice, since many recent studies have shown that incorporating second-order information enhances the prediction performance of classification models. Thus, we propose a family of cost-sensitive online classification algorithms with adaptive regularization in this paper. We theoretically analyze the proposed algorithms and empirically validate their effectiveness and properties in extensive experiments. Then, for a better trade-off between performance and efficiency, we further introduce the sketching technique into our algorithms, which significantly accelerates computation with only a slight performance loss. Finally, we apply our algorithms to several real-world online anomaly detection tasks. Promising results show that the proposed algorithms are effective and efficient in solving cost-sensitive online classification problems in various real-world domains.
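
The two metrics named in the abstract can be computed as in the sketch below. The weights `eta_p`, `c_p` and `c_n` and the +1/-1 label convention are illustrative assumptions, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) for labels in {+1, -1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    return tp, fn, tn, fp


def weighted_sum(y_true, y_pred, eta_p=0.5):
    """eta_p * sensitivity + (1 - eta_p) * specificity (higher is better)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return eta_p * sensitivity + (1.0 - eta_p) * specificity


def weighted_cost(y_true, y_pred, c_p=0.9, c_n=0.1):
    """c_p * (false negatives) + c_n * (false positives) (lower is better)."""
    _, fn, _, fp = confusion_counts(y_true, y_pred)
    return c_p * fn + c_n * fp


y_true = [1, 1, -1, -1, -1, -1]
y_pred = [1, -1, -1, -1, -1, 1]
print(weighted_sum(y_true, y_pred))   # sensitivity 0.5, specificity 0.75
print(weighted_cost(y_true, y_pred))  # one false negative, one false positive
```

Setting `c_p` larger than `c_n` makes a missed positive more expensive than a false alarm, which is the usual choice when the positive class is rare.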

End-to-End Learning of Motion Representation for Video Understanding

Apr 02, 2018
Lijie Fan, Wenbing Huang, Chuang Gan, Stefano Ermon, Boqing Gong, Junzhou Huang

Despite the recent success of end-to-end learned representations, hand-crafted optical flow features are still widely used in video analysis tasks. To fill this gap, we propose TVNet, a novel end-to-end trainable neural network, to learn optical-flow-like features from data. TVNet subsumes a specific optical flow solver, the TV-L1 method, and is initialized by unfolding its optimization iterations as neural layers. TVNet can therefore be used directly without any extra learning. Moreover, it can be naturally concatenated with other task-specific networks to formulate an end-to-end architecture, thus making our method more efficient than current multi-stage approaches by avoiding the need to pre-compute and store features on disk. Finally, the parameters of TVNet can be further fine-tuned by end-to-end training. This enables TVNet to learn richer and task-specific patterns beyond exact optical flow. Extensive experiments on two action recognition benchmarks verify the effectiveness of the proposed approach. Our TVNet achieves better accuracy than all compared methods, while being competitive with the fastest counterpart in terms of feature extraction time.

* CVPR 2018 spotlight. The first two authors contributed equally to this paper

Robust Actor-Critic Contextual Bandit for Mobile Health (mHealth) Interventions

Feb 27, 2018
Feiyun Zhu, Jun Guo, Ruoyu Li, Junzhou Huang

We consider the actor-critic contextual bandit for mobile health (mHealth) interventions. State-of-the-art decision-making algorithms generally ignore the outliers in the dataset. In this paper, we propose a novel robust contextual bandit method for mHealth. It achieves the conflicting goals of reducing the influence of outliers while seeking a solution similar to that of state-of-the-art contextual bandit methods on datasets without outliers. This performance relies on two techniques: (1) the capped-$\ell_{2}$ norm; (2) a reliable method to set the thresholding hyper-parameter, which is inspired by one of the most fundamental techniques in statistics. Although the model is non-convex and non-differentiable, we propose an effective reweighted algorithm and provide solid theoretical analyses. We prove that the proposed algorithm finds sufficiently decreasing points after each iteration and converges after a finite number of iterations. Extensive experimental results on two datasets demonstrate that our method achieves almost identical results to state-of-the-art contextual bandit methods on the dataset without outliers, and significantly outperforms those methods on the badly noised dataset with outliers in a variety of parameter settings.

Adaptive Graph Convolutional Neural Networks

Jan 10, 2018
Ruoyu Li, Sheng Wang, Feiyun Zhu, Junzhou Huang

Graph Convolutional Neural Networks (Graph CNNs) are generalizations of classical CNNs to handle graph data such as molecular data, point clouds and social networks. Current filters in graph CNNs are built for a fixed and shared graph structure. However, for most real data, the graph structures vary in both size and connectivity. The paper proposes a generalized and flexible graph CNN that takes data of arbitrary graph structure as input. In that way, a task-driven adaptive graph is learned for each graph sample during training. To learn the graph efficiently, a distance metric learning method is proposed. Extensive experiments on nine graph-structured datasets demonstrate superior performance in both convergence speed and predictive accuracy.

* The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 8 pages
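
One way to picture building a graph from a learned distance metric is sketched below. It assumes a Mahalanobis-style distance with a projection matrix `W` and a Gaussian kernel with width `sigma`; these choices and the function names are assumptions for illustration, and in training `W` would be updated by gradient descent rather than fixed as it is here.

```python
import math


def project(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def adaptive_adjacency(features, W, sigma=1.0):
    """Dense adjacency A[i][j] = exp(-d(x_i, x_j) / (2 * sigma^2)), with d = ||W (x_i - x_j)||."""
    n = len(features)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            diff = [a - b for a, b in zip(features[i], features[j])]
            dist = math.sqrt(sum(z * z for z in project(W, diff)))
            A[i][j] = math.exp(-dist / (2.0 * sigma ** 2))
    return A


features = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]]  # toy node features
W = [[1.0, 0.0], [0.0, 1.0]]                     # identity: plain Euclidean distance
A = adaptive_adjacency(features, W)
print(A[0][1] > A[0][2])  # the nearer node gets the stronger edge
```

Because the adjacency is computed from node features, each input sample can have its own graph of whatever size its node list implies.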

Cohesion-based Online Actor-Critic Reinforcement Learning for mHealth Intervention

Aug 23, 2017
Feiyun Zhu, Peng Liao, Xinliang Zhu, Yaowen Yao, Junzhou Huang

With the vast population of smart device users worldwide, mobile health (mHealth) technologies promise to have a positive and wide influence on people's health. They can provide flexible, affordable and portable health guidance to device users. Current online decision-making methods for mHealth assume that the users are completely heterogeneous. They share no information among users and learn a separate policy for each user. However, the data for each user is too limited to support separate online learning, leading to unstable policies with high variance. Besides, we observe that a user may be similar to some, but not all, users, and that connected users tend to have similar behaviors. In this paper, we propose a network cohesion constrained (actor-critic) Reinforcement Learning (RL) method for mHealth. The goal is to explore how to share information among similar users to better convert the limited user information into sharper learned policies. To the best of our knowledge, this is the first online actor-critic RL method for mHealth and the first network cohesion constrained (actor-critic) RL method in any application. The network cohesion is important for deriving effective policies. We propose a novel method to learn the network by using the warm start trajectory, which directly reflects the users' properties. The optimization of our model is difficult and very different from general supervised learning due to the indirect observation of values. As a contribution, we propose two algorithms for the proposed online RL methods. Apart from mHealth, the proposed methods can be easily applied or adapted to other health-related tasks. Extensive experimental results on the HeartSteps dataset demonstrate that, in a variety of parameter settings, the two proposed methods obtain clear improvements over the state-of-the-art methods.
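
A network cohesion constraint is commonly expressed as a penalty on parameter differences between connected users. The sketch below shows that generic form only; the exact regularizer used in the paper is not given in the abstract, and `cohesion_penalty` and its arguments are hypothetical names.

```python
def cohesion_penalty(params, edges):
    """Sum of squared differences between the parameter vectors of connected users."""
    total = 0.0
    for i, j in edges:
        total += sum((a - b) ** 2 for a, b in zip(params[i], params[j]))
    return total


# Toy per-user policy parameters and an undirected user network (both made up).
params = [[1.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
edges = [(0, 1), (1, 2)]
print(cohesion_penalty(params, edges))  # 5.0
```

Adding a multiple of this penalty to the sum of per-user losses pulls connected users toward similar policies while leaving unconnected users free to differ.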

Robust Contextual Bandit via the Capped-$\ell_{2}$ norm

Aug 17, 2017
Feiyun Zhu, Xinliang Zhu, Sheng Wang, Jiawen Yao, Junzhou Huang

This paper considers the actor-critic contextual bandit for mobile health (mHealth) interventions. State-of-the-art decision-making methods in mHealth generally assume that the noise in the dynamic system follows a Gaussian distribution. Those methods use least-squares-based algorithms to estimate the expected reward, which are sensitive to outliers. To deal with outliers, we propose a novel robust actor-critic contextual bandit method for mHealth interventions. In the critic update, the capped-$\ell_{2}$ norm is used to measure the approximation error, which prevents outliers from dominating our objective. A set of weights is obtained from the critic update. Using them gives a weighted objective for the actor update, in which samples that are badly noised in the critic update receive zero weight. As a result, the robustness of both the actor and the critic updates is enhanced. There is a key parameter in the capped-$\ell_{2}$ norm. We provide a reliable method to set it properly by making use of one of the most fundamental definitions of outliers in statistics. Extensive experimental results demonstrate that our method achieves almost identical results to the state-of-the-art methods on the dataset without outliers and dramatically outperforms them on the datasets noised by outliers.
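
The effect of a capped loss with zero weights for outliers can be seen in a small reweighted least-squares sketch. It assumes a one-dimensional linear model, a squared-residual cap `eps` chosen by hand, and a tiny ridge term; it is not the paper's critic update, and its threshold-setting rule is not reproduced here.

```python
def capped_l2_fit(xs, rs, eps, ridge=1e-12, iters=20):
    """Fit r ~ w * x by iteratively reweighted least squares with a capped squared loss."""
    weights = [1.0] * len(xs)  # start from ordinary least squares
    w = 0.0
    for _ in range(iters):
        num = sum(u * x * r for u, x, r in zip(weights, xs, rs))
        den = sum(u * x * x for u, x in zip(weights, xs)) + ridge
        w = num / den
        # Samples whose squared residual exceeds the cap get zero weight.
        new_weights = [1.0 if (r - w * x) ** 2 <= eps else 0.0 for x, r in zip(xs, rs)]
        if new_weights == weights:
            break
        weights = new_weights
    return w, weights


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
rs = [2.0, 4.0, 6.0, 30.0, 10.0, 12.0, 14.0, 16.0]  # r = 2x, except one outlier at x = 4
w, weights = capped_l2_fit(xs, rs, eps=25.0)
print(round(w, 6), weights)
```

The returned weights are the kind of by-product the abstract describes: they mark which samples were treated as outliers and could be reused to down-weight the same samples in a second objective.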

Group-driven Reinforcement Learning for Personalized mHealth Intervention

Aug 14, 2017
Feiyun Zhu, Jun Guo, Zheng Xu, Peng Liao, Junzhou Huang

Due to the popularity of smartphones and wearable devices, mobile health (mHealth) technologies promise to have a positive and wide impact on people's health. State-of-the-art decision-making methods for mHealth rely on idealized assumptions: they assume that the users are either completely homogeneous or completely heterogeneous. However, in reality, a user might be similar to some, but not all, users. In this paper, we propose a novel group-driven reinforcement learning method for mHealth. We aim to understand how to share information among similar users to better convert the limited user information into sharper learned RL policies. Specifically, we employ the K-means clustering method to group users based on the similarity of their trajectory information and learn a shared RL policy for each group. Extensive experimental results show that our method achieves clear gains over the state-of-the-art RL methods for mHealth.
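
The grouping step can be sketched with a plain K-means (Lloyd's algorithm) over per-user feature vectors. The toy features, the choice to initialise centres from the first `k` users, and the function names are assumptions for illustration; learning one RL policy per group is outside the sketch.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with centres initialised from the first k points."""
    centers = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centre
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Each row stands for one user's trajectory summary (made-up numbers).
user_features = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centers = kmeans(user_features, k=2)
print(labels)  # [0, 0, 1, 1]
```

Users sharing a label would then pool their data to train a single shared policy, which is the information-sharing step the abstract describes.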
