Ian Goodfellow

Universite de Montreal

Adversarial Autoencoders

May 25, 2016

Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, Brendan Frey

Abstract:In this paper, we propose the "adversarial autoencoder" (AAE), which is a probabilistic autoencoder that uses the recently proposed generative adversarial networks (GAN) to perform variational inference by matching the aggregated posterior of the hidden code vector of the autoencoder with an arbitrary prior distribution. Matching the aggregated posterior to the prior ensures that generating from any part of prior space results in meaningful samples. As a result, the decoder of the adversarial autoencoder learns a deep generative model that maps the imposed prior to the data distribution. We show how the adversarial autoencoder can be used in applications such as semi-supervised classification, disentangling style and content of images, unsupervised clustering, dimensionality reduction and data visualization. We performed experiments on MNIST, Street View House Numbers and Toronto Face datasets and show that adversarial autoencoders achieve competitive results in generative modeling and semi-supervised classification tasks.

Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples

May 24, 2016

Nicolas Papernot, Patrick McDaniel, Ian Goodfellow

Abstract:Many machine learning models are vulnerable to adversarial examples: inputs that are specially crafted to cause a machine learning model to produce an incorrect output. Adversarial examples that affect one model often affect another model, even if the two models have different architectures or were trained on different training sets, so long as both models were trained to perform the same task. An attacker may therefore train their own substitute model, craft adversarial examples against the substitute, and transfer them to a victim model, with very little information about the victim. Recent work has further developed a technique that uses the victim model as an oracle to label a synthetic training set for the substitute, so the attacker need not even collect a training set to mount the attack. We extend these recent techniques using reservoir sampling to greatly enhance the efficiency of the training procedure for the substitute model. We introduce new transferability attacks between previously unexplored (substitute, victim) pairs of machine learning model classes, most notably SVMs and decision trees. We demonstrate our attacks on two commercial machine learning classification systems from Amazon (96.19% misclassification rate) and Google (88.94%) using only 800 queries of the victim model, thereby showing that existing machine learning approaches are in general vulnerable to systematic black-box attacks regardless of their structure.
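The abstract mentions reservoir sampling but does not describe how it is wired into substitute training. As a rough illustration only, the sketch below shows the classic single-pass reservoir sampling algorithm (Algorithm R), which keeps a uniform random subset of fixed size from a stream of unknown length. The function name `reservoir_sample` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return k items chosen uniformly at random from an iterable.

    Generic Algorithm R; an illustrative sketch, not the paper's procedure.
    """
    if rng is None:
        rng = random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Example: keep 10 candidate inputs out of a stream of 1000.
sample = reservoir_sample(range(1000), 10, random.Random(0))
```

Because the reservoir never grows past `k` items, the number of points that would need to be labeled by an oracle stays bounded no matter how many candidates are generated.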

Theano: A Python framework for fast computation of mathematical expressions

May 09, 2016

The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov (+103 more)

Abstract:Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.

* 19 pages, 5 figures

Net2Net: Accelerating Learning via Knowledge Transfer

Apr 23, 2016

Tianqi Chen, Ian Goodfellow, Jonathon Shlens

Abstract:We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often trains very many different neural networks during the experimentation and design process. This is a wasteful process in which each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training that altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state of the art accuracy rating on the ImageNet dataset.

* ICLR 2016 submission
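To make the idea of a function-preserving transformation concrete, here is a minimal sketch of widening one hidden layer: extra hidden units are copies of existing ones, and the outgoing weights of every copied unit are divided by its number of copies so the network output is unchanged. The two-layer ReLU network without biases, the helper names (`forward`, `net2wider`), and the random choice of which units to copy are assumptions made for illustration; the abstract does not specify these details.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(W1, W2, x):
    """Two-layer network: linear, ReLU, linear (no biases, for brevity)."""
    hidden = [max(0.0, v) for v in matvec(W1, x)]
    return matvec(W2, hidden)


def net2wider(W1, W2, new_width, rng):
    """Widen the hidden layer to new_width units without changing the function.

    W1 has one row per hidden unit; W2 has one column per hidden unit.
    """
    old_width = len(W1)
    # Each new unit is a copy of some old unit; the first old_width map to themselves.
    mapping = list(range(old_width)) + [
        rng.randrange(old_width) for _ in range(new_width - old_width)
    ]
    counts = [mapping.count(j) for j in range(old_width)]
    # Copy incoming weights so replicated units compute identical activations.
    W1_new = [list(W1[j]) for j in mapping]
    # Split outgoing weights evenly among the copies of each unit.
    W2_new = [[row[j] / counts[j] for j in mapping] for row in W2]
    return W1_new, W2_new


rng = random.Random(0)
W1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # 3 inputs, 4 hidden
W2 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]  # 4 hidden, 2 outputs
W1_wide, W2_wide = net2wider(W1, W2, 7, rng)
x = [0.5, -1.0, 2.0]
# forward(W1, W2, x) and forward(W1_wide, W2_wide, x) agree up to rounding error.
```

The wider network starts from the same function as the original one, so further training can begin from the smaller model's performance rather than from scratch.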

Improving the Robustness of Deep Neural Networks via Stability Training

Apr 15, 2016

Stephan Zheng, Yang Song, Thomas Leung, Ian Goodfellow

Abstract:In this paper we address the issue of output instability of deep neural networks: small perturbations in the visual input can significantly distort the feature embeddings and output of a neural network. Such instability affects many deep architectures with state-of-the-art performance on a wide range of computer vision tasks. We present a general stability training method to stabilize deep networks against small input distortions that result from various types of common image processing, such as compression, rescaling, and cropping. We validate our method by stabilizing the state-of-the-art Inception architecture against these types of distortions. In addition, we demonstrate that our stabilized model gives robust state-of-the-art performance on large-scale near-duplicate detection, similar-image ranking, and classification on noisy datasets.

* Published in CVPR 2016

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

Mar 16, 2016

Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin (+30 more)

Abstract:TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.

* Version 2 updates only the metadata, to correct the formatting of Martín Abadi's name

Efficient Per-Example Gradient Computations

Oct 09, 2015

Ian Goodfellow

Abstract:This technical report describes an efficient technique for computing the norm of the gradient of the loss function for a neural network with respect to its parameters. This gradient norm can be computed efficiently for every example.

* This revision fixed some typos. Many thanks to Hugo Larochelle for reporting them!
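The abstract does not spell out the technique, so the following is only a sketch of one identity that makes per-example gradient norms cheap for a single linear layer z = W h: the gradient of the loss with respect to W for one example is the outer product of dL/dz and h, so its squared Frobenius norm equals ||dL/dz||^2 * ||h||^2 and the outer product never has to be formed. The squared-error loss and all function names here are assumptions chosen for illustration.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sq_grad_norms_fast(W, H, Y):
    """Squared gradient norm per example, using ||dL/dz||^2 * ||h||^2.

    Assumes the loss 0.5 * ||W h - y||^2, so dL/dz = W h - y.
    """
    norms = []
    for h, y in zip(H, Y):
        z_bar = [zi - yi for zi, yi in zip(matvec(W, h), y)]
        norms.append(sum(d * d for d in z_bar) * sum(a * a for a in h))
    return norms


def sq_grad_norms_naive(W, H, Y):
    """Same quantity, computed by explicitly building each gradient matrix."""
    norms = []
    for h, y in zip(H, Y):
        z_bar = [zi - yi for zi, yi in zip(matvec(W, h), y)]
        grad = [[d * a for a in h] for d in z_bar]  # outer product dL/dW
        norms.append(sum(g * g for row in grad for g in row))
    return norms


W = [[0.2, -0.5, 1.0], [0.7, 0.1, -0.3]]     # 3 inputs, 2 outputs
H = [[1.0, 2.0, 3.0], [0.5, -1.0, 0.0]]      # two example inputs
Y = [[0.0, 1.0], [1.0, -1.0]]                # two example targets
fast = sq_grad_norms_fast(W, H, Y)
naive = sq_grad_norms_naive(W, H, Y)
# fast and naive agree up to floating-point rounding.
```

The fast version costs time proportional to the layer's input plus output size per example, whereas the naive version costs time proportional to their product.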

Intriguing properties of neural networks

Feb 19, 2014

Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, Rob Fergus

Abstract:Deep neural networks are highly expressive models that have recently achieved state of the art performance on speech and visual recognition tasks. While their expressiveness is the reason they succeed, it also causes them to learn uninterpretable solutions that could have counter-intuitive properties. In this paper we report two such properties. First, we find that there is no distinction between individual high level units and random linear combinations of high level units, according to various methods of unit analysis. It suggests that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks. Second, we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain imperceptible perturbation, which is found by maximizing the network's prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input.

Joint Training of Deep Boltzmann Machines

Dec 12, 2012

Ian Goodfellow, Aaron Courville, Yoshua Bengio

Abstract:We introduce a new method for training deep Boltzmann machines jointly. Prior methods require an initial learning pass that trains the deep Boltzmann machine greedily, one layer at a time, or do not perform well on classification tasks.

* 4 pages

Theano: new features and speed improvements

Nov 23, 2012

Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, James Bergstra, Ian Goodfellow, Arnaud Bergeron, Nicolas Bouchard, David Warde-Farley, Yoshua Bengio

Abstract:Theano is a linear algebra compiler that optimizes a user's symbolically-specified mathematical computations to produce efficient low-level implementations. In this paper, we present new features and efficiency improvements to Theano, and benchmarks demonstrating Theano's performance relative to Torch7, a recently introduced machine learning library, and to RNNLM, a C++ library targeted at recurrent neural networks.

* Presented at the Deep Learning Workshop, NIPS 2012
