Yunus Saatchi

Bayesian GAN

Nov 08, 2017

Yunus Saatchi, Andrew Gordon Wilson

Abstract: Generative adversarial networks (GANs) can implicitly learn rich distributions over images, audio, and data which are hard to model with an explicit likelihood. We present a practical Bayesian formulation for unsupervised and semi-supervised learning with GANs. Within this framework, we use stochastic gradient Hamiltonian Monte Carlo to marginalize the weights of the generator and discriminator networks. The resulting approach is straightforward and obtains good performance without any standard interventions such as feature matching or mini-batch discrimination. By exploring an expressive posterior over the parameters of the generator, the Bayesian GAN avoids mode-collapse, produces interpretable and diverse candidate samples, and provides state-of-the-art quantitative results for semi-supervised learning on benchmarks including SVHN, CelebA, and CIFAR-10, outperforming DCGAN, Wasserstein GANs, and DCGAN ensembles.

* Advances in Neural Information Processing Systems 30 (NIPS), 2017
* Updated to the version that appears at Advances in Neural Information Processing Systems 30 (NIPS), 2017
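
The abstract above names stochastic gradient Hamiltonian Monte Carlo (SGHMC) as the sampler used to marginalize network weights. As a rough illustration only, the sketch below runs a generic SGHMC-style update on a one-dimensional toy target, a Gaussian with mean 2 and unit variance, using a noisy gradient in place of a mini-batch gradient. The function names, the step size `eta`, the friction `alpha`, and the toy target are all assumptions made for this example; it is not the paper's implementation and involves no GAN.

```python
import numpy as np

def sghmc_sample(grad_u, theta0, n_steps=20000, eta=0.01, alpha=0.1, seed=0):
    """Generic SGHMC-style sampler for a scalar parameter (illustrative only).

    grad_u(theta, rng) returns a noisy estimate of the gradient of the
    negative log posterior U(theta). eta and alpha are assumed values.
    """
    rng = np.random.default_rng(seed)
    theta = float(theta0)
    v = 0.0  # momentum-like auxiliary variable
    noise_std = np.sqrt(2.0 * alpha * eta)  # injected noise balancing the friction
    samples = np.empty(n_steps)
    for t in range(n_steps):
        # Friction shrinks v, the noisy gradient pulls toward high density,
        # and the injected Gaussian noise keeps the chain exploring.
        v = (1.0 - alpha) * v - eta * grad_u(theta, rng) + noise_std * rng.standard_normal()
        theta = theta + v
        samples[t] = theta
    return samples

def noisy_grad(theta, rng):
    # Toy target: U(theta) = (theta - 2)^2 / 2, i.e. a N(2, 1) "posterior".
    # The extra noise term stands in for mini-batch gradient noise.
    return (theta - 2.0) + 0.1 * rng.standard_normal()

# Discard the first 2000 draws as burn-in.
samples = sghmc_sample(noisy_grad, theta0=0.0)[2000:]
print("sample mean:", samples.mean(), "sample std:", samples.std())
```

Averaging a prediction over such samples is what approximates the marginalization; in this toy case the sample mean and standard deviation should land near 2 and 1.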

Learning Scalable Deep Kernels with Recurrent Structure

Oct 05, 2017

Maruan Al-Shedivat, Andrew Gordon Wilson, Yunus Saatchi, Zhiting Hu, Eric P. Xing

Abstract: Many applications in speech, robotics, finance, and biology deal with sequential data, where ordering matters and recurrent structures are common. However, this structure cannot be easily captured by standard kernel functions. To model such structure, we propose expressive closed-form kernel functions for Gaussian processes. The resulting model, GP-LSTM, fully encapsulates the inductive biases of long short-term memory (LSTM) recurrent networks, while retaining the non-parametric probabilistic advantages of Gaussian processes. We learn the properties of the proposed kernels by optimizing the Gaussian process marginal likelihood using a new provably convergent semi-stochastic gradient procedure and exploit the structure of these kernels for scalable training and prediction. This approach provides a practical representation for Bayesian LSTMs. We demonstrate state-of-the-art performance on several benchmarks, and thoroughly investigate a consequential autonomous driving application, where the predictive uncertainties provided by GP-LSTM are uniquely valuable.

* Journal of Machine Learning Research (JMLR), JMLR 18(82):1-37, 2017
* 37 pages, 7 figures, 5 tables. Updated to the final version that appears in JMLR, 18(82):1-37, 2017
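
The abstract above describes learning kernel properties by optimizing the Gaussian process marginal likelihood. As a minimal sketch of the quantity being optimized, the code below evaluates the standard GP log marginal likelihood for an RBF kernel applied to feature vectors. The fixed `tanh` projection is only a hypothetical stand-in for a recurrent network's output, and the function names, lengthscale, and noise variance are assumptions; the paper's semi-stochastic training procedure and scalable kernel structure are not reproduced here.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_log_marginal_likelihood(features, y, lengthscale=1.0, noise_var=0.1):
    """log p(y | features) for a zero-mean GP with Gaussian observation noise."""
    n = y.shape[0]
    k = rbf_kernel(features, features, lengthscale) + noise_var * np.eye(n)
    chol = np.linalg.cholesky(k)  # k = chol @ chol.T
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))  # k^{-1} y
    # The sum of log-diagonal entries of chol equals 0.5 * log det(k).
    return -0.5 * y @ alpha - np.sum(np.log(np.diag(chol))) - 0.5 * n * np.log(2.0 * np.pi)

rng = np.random.default_rng(0)
sequences = rng.standard_normal((20, 5))  # 20 toy sequences of length 5
w = rng.standard_normal((5, 3))           # hypothetical fixed projection weights
features = np.tanh(sequences @ w)         # stand-in for a recurrent net's final state
y = rng.standard_normal(20)

lml = gp_log_marginal_likelihood(features, y)
print("log marginal likelihood:", lml)
```

In a deep-kernel setting, the gradient of this value with respect to both the kernel hyperparameters and the feature-map weights would drive training; here the weights are held fixed to keep the example short.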
