Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Samy Bengio

Google Research

Adversarial examples in the physical world

Feb 11, 2017

Alexey Kurakin, Ian Goodfellow, Samy Bengio

Figure 1 for Adversarial examples in the physical world

Figure 2 for Adversarial examples in the physical world

Figure 3 for Adversarial examples in the physical world

Figure 4 for Adversarial examples in the physical world

Abstract:Most existing machine learning classifiers are highly vulnerable to adversarial examples. An adversarial example is a sample of input data which has been modified very slightly in a way that is intended to cause a machine learning classifier to misclassify it. In many cases, these modifications can be so subtle that a human observer does not even notice the modification at all, yet the classifier still makes a mistake. Adversarial examples pose security concerns because they could be used to perform an attack on machine learning systems, even if the adversary has no access to the underlying model. Up to now, all previous work have assumed a threat model in which the adversary can feed data directly into the machine learning classifier. This is not always the case for systems operating in the physical world, for example those which are using signals from cameras and other sensors as an input. This paper shows that even in such physical world scenarios, machine learning systems are vulnerable to adversarial examples. We demonstrate this by feeding adversarial images obtained from cell-phone camera to an ImageNet Inception classifier and measuring the classification accuracy of the system. We find that a large fraction of adversarial examples are classified incorrectly even when perceived through the camera.

* 14 pages, 6 figures. Demo available at https://youtu.be/zQ_uMenoBCk

Via

Access Paper or Ask Questions

Adversarial Machine Learning at Scale

Feb 11, 2017

Alexey Kurakin, Ian Goodfellow, Samy Bengio

Figure 1 for Adversarial Machine Learning at Scale

Figure 2 for Adversarial Machine Learning at Scale

Figure 3 for Adversarial Machine Learning at Scale

Figure 4 for Adversarial Machine Learning at Scale

Abstract:Adversarial examples are malicious inputs designed to fool machine learning models. They often transfer from one model to another, allowing attackers to mount black box attacks without knowledge of the target model's parameters. Adversarial training is the process of explicitly training a model on adversarial examples, in order to make it more robust to attack or to reduce its test error on clean inputs. So far, adversarial training has primarily been applied to small problems. In this research, we apply adversarial training to ImageNet. Our contributions include: (1) recommendations for how to succesfully scale adversarial training to large models and datasets, (2) the observation that adversarial training confers robustness to single-step attack methods, (3) the finding that multi-step attack methods are somewhat less transferable than single-step attack methods, so single-step attacks are the best for mounting black-box attacks, and (4) resolution of a "label leaking" effect that causes adversarially trained models to perform better on adversarial examples than on clean examples, because the adversarial example construction process uses the true label and the model can learn to exploit regularities in the construction process.

* 17 pages, 5 figures

Via

Access Paper or Ask Questions

Neural Combinatorial Optimization with Reinforcement Learning

Jan 12, 2017

Irwan Bello, Hieu Pham, Quoc V. Le, Mohammad Norouzi, Samy Bengio

Figure 1 for Neural Combinatorial Optimization with Reinforcement Learning

Figure 2 for Neural Combinatorial Optimization with Reinforcement Learning

Figure 3 for Neural Combinatorial Optimization with Reinforcement Learning

Figure 4 for Neural Combinatorial Optimization with Reinforcement Learning

Abstract:This paper presents a framework to tackle combinatorial optimization problems using neural networks and reinforcement learning. We focus on the traveling salesman problem (TSP) and train a recurrent network that, given a set of city coordinates, predicts a distribution over different city permutations. Using negative tour length as the reward signal, we optimize the parameters of the recurrent network using a policy gradient method. We compare learning the network parameters on a set of training graphs against learning them on individual test graphs. Despite the computational expense, without much engineering and heuristic designing, Neural Combinatorial Optimization achieves close to optimal results on 2D Euclidean graphs with up to 100 nodes. Applied to the KnapSack, another NP-hard problem, the same method obtains optimal solutions for instances with up to 200 items.

* Under review as a conference paper at ICLR 2017

Via

Access Paper or Ask Questions

Reward Augmented Maximum Likelihood for Neural Structured Prediction

Jan 04, 2017

Mohammad Norouzi, Samy Bengio, Zhifeng Chen, Navdeep Jaitly, Mike Schuster, Yonghui Wu, Dale Schuurmans

Figure 1 for Reward Augmented Maximum Likelihood for Neural Structured Prediction

Figure 2 for Reward Augmented Maximum Likelihood for Neural Structured Prediction

Figure 3 for Reward Augmented Maximum Likelihood for Neural Structured Prediction

Abstract:A key problem in structured output prediction is direct optimization of the task reward function that matters for test evaluation. This paper presents a simple and computationally efficient approach to incorporate task reward into a maximum likelihood framework. By establishing a link between the log-likelihood and expected reward objectives, we show that an optimal regularized expected reward is achieved when the conditional distribution of the outputs given the inputs is proportional to their exponentiated scaled rewards. Accordingly, we present a framework to smooth the predictive probability of the outputs using their corresponding rewards. We optimize the conditional log-probability of augmented outputs that are sampled proportionally to their exponentiated scaled rewards. Experiments on neural sequence to sequence models for speech recognition and machine translation show notable improvements over a maximum likelihood baseline by using reward augmented maximum likelihood (RAML), where the rewards are defined as the negative edit distance between the outputs and the ground truth labels.

* NIPS 2016

Via

Access Paper or Ask Questions

Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

Sep 21, 2016

Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan

Figure 1 for Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

Figure 2 for Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

Figure 3 for Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

Figure 4 for Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

Abstract:Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image. The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. Our model is often quite accurate, which we verify both qualitatively and quantitatively. Finally, given the recent surge of interest in this task, a competition was organized in 2015 using the newly released COCO dataset. We describe and analyze the various improvements we applied to our own baseline and show the resulting performance in the competition, which we won ex-aequo with a team from Microsoft Research, and provide an open source implementation in TensorFlow.

* IEEE Transactions on Pattern Analysis and Machine Intelligence ( Volume: PP, Issue: 99 , July 2016 )
* arXiv admin note: substantial text overlap with arXiv:1411.4555

Via

Access Paper or Ask Questions

A Neural Transducer

Aug 04, 2016

Navdeep Jaitly, David Sussillo, Quoc V. Le, Oriol Vinyals, Ilya Sutskever, Samy Bengio

Abstract:Sequence-to-sequence models have achieved impressive results on various tasks. However, they are unsuitable for tasks that require incremental predictions to be made as more data arrives or tasks that have long input sequences and output sequences. This is because they generate an output sequence conditioned on an entire input sequence. In this paper, we present a Neural Transducer that can make incremental predictions as more input arrives, without redoing the entire computation. Unlike sequence-to-sequence models, the Neural Transducer computes the next-step distribution conditioned on the partially observed input sequence and the partially generated sequence. At each time step, the transducer can decide to emit zero to many output symbols. The data can be processed using an encoder and presented as input to the transducer. The discrete decision to emit a symbol at every time step makes it difficult to learn with conventional backpropagation. It is however possible to train the transducer by using a dynamic programming algorithm to generate target discrete decisions. Our experiments show that the Neural Transducer works well in settings where it is required to produce output predictions as data come in. We also find that the Neural Transducer performs well for long sequences even when attention mechanisms are not used.

Via

Access Paper or Ask Questions

Generating Sentences from a Continuous Space

May 12, 2016

Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, Samy Bengio

Figure 1 for Generating Sentences from a Continuous Space

Figure 2 for Generating Sentences from a Continuous Space

Figure 3 for Generating Sentences from a Continuous Space

Figure 4 for Generating Sentences from a Continuous Space

Abstract:The standard recurrent neural network language model (RNNLM) generates sentences one word at a time and does not work from an explicit global sentence representation. In this work, we introduce and study an RNN-based variational autoencoder generative model that incorporates distributed latent representations of entire sentences. This factorization allows it to explicitly model holistic properties of sentences such as style, topic, and high-level syntactic features. Samples from the prior over these sentence representations remarkably produce diverse and well-formed sentences through simple deterministic decoding. By examining paths through this latent space, we are able to generate coherent novel sentences that interpolate between known sentences. We present techniques for solving the difficult learning problem presented by this model, demonstrate its effectiveness in imputing missing words, explore many interesting properties of the model's latent sentence space, and present negative results on the use of the model in language modeling.

* SIGNLL Conference on Computational Natural Language Learning (CONLL), 2016
* First two authors contributed equally. Work was done when all authors were at Google, Inc

Via

Access Paper or Ask Questions

Order Matters: Sequence to sequence for sets

Feb 23, 2016

Oriol Vinyals, Samy Bengio, Manjunath Kudlur

Figure 1 for Order Matters: Sequence to sequence for sets

Figure 2 for Order Matters: Sequence to sequence for sets

Figure 3 for Order Matters: Sequence to sequence for sets

Figure 4 for Order Matters: Sequence to sequence for sets

Abstract:Sequences have become first class citizens in supervised learning thanks to the resurgence of recurrent neural networks. Many complex tasks that require mapping from or to a sequence of observations can now be formulated with the sequence-to-sequence (seq2seq) framework which employs the chain rule to efficiently represent the joint probability of sequences. In many cases, however, variable sized inputs and/or outputs might not be naturally expressed as sequences. For instance, it is not clear how to input a set of numbers into a model where the task is to sort them; similarly, we do not know how to organize outputs when they correspond to random variables and the task is to model their unknown joint probability. In this paper, we first show using various examples that the order in which we organize input and/or output data matters significantly when learning an underlying model. We then discuss an extension of the seq2seq framework that goes beyond sequences and handles input sets in a principled way. In addition, we propose a loss which, by searching over possible orders during training, deals with the lack of structure of output sets. We show empirical evidence of our claims regarding ordering, and on the modifications to the seq2seq framework on benchmark language modeling and parsing tasks, as well as two artificial tasks -- sorting numbers and estimating the joint probability of unknown graphical models.

* Accepted as a conference paper at ICLR 2015

Via

Access Paper or Ask Questions

End-to-End Text-Dependent Speaker Verification

Sep 27, 2015

Georg Heigold, Ignacio Moreno, Samy Bengio, Noam Shazeer

Figure 1 for End-to-End Text-Dependent Speaker Verification

Figure 2 for End-to-End Text-Dependent Speaker Verification

Figure 3 for End-to-End Text-Dependent Speaker Verification

Figure 4 for End-to-End Text-Dependent Speaker Verification

Abstract:In this paper we present a data-driven, integrated approach to speaker verification, which maps a test utterance and a few reference utterances directly to a single score for verification and jointly optimizes the system's components using the same evaluation protocol and metric as at test time. Such an approach will result in simple and efficient systems, requiring little domain-specific knowledge and making few model assumptions. We implement the idea by formulating the problem as a single neural network architecture, including the estimation of a speaker model on only a few utterances, and evaluate it on our internal "Ok Google" benchmark for text-dependent speaker verification. The proposed approach appears to be very effective for big data applications like ours that require highly accurate, easy-to-maintain systems with a small footprint.

* submitted to ICASSP 2016

Via

Access Paper or Ask Questions

Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks

Sep 23, 2015

Samy Bengio, Oriol Vinyals, Navdeep Jaitly, Noam Shazeer

Figure 1 for Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks

Figure 2 for Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks

Figure 3 for Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks

Figure 4 for Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks

Abstract:Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning. The current approach to training them consists of maximizing the likelihood of each token in the sequence given the current (recurrent) state and the previous token. At inference, the unknown previous token is then replaced by a token generated by the model itself. This discrepancy between training and inference can yield errors that can accumulate quickly along the generated sequence. We propose a curriculum learning strategy to gently change the training process from a fully guided scheme using the true previous token, towards a less guided scheme which mostly uses the generated token instead. Experiments on several sequence prediction tasks show that this approach yields significant improvements. Moreover, it was used successfully in our winning entry to the MSCOCO image captioning challenge, 2015.

Via

Access Paper or Ask Questions