Soumith Chintala

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Dec 03, 2019
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala

Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it provides an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs. In this paper, we detail the principles that drove the implementation of PyTorch and how they are reflected in its architecture. We emphasize that every aspect of PyTorch is a regular Python program under the full control of its user. We also explain how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance. We demonstrate the efficiency of individual subsystems, as well as the overall speed of PyTorch on several common benchmarks.

* 12 pages, 3 figures, NeurIPS 2019
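To make the "imperative style" in the abstract above concrete, here is a minimal sketch of define-by-run automatic differentiation in plain Python. It is not PyTorch's implementation or API: the `Value` class and the `model` function are hypothetical names invented for this illustration, and only scalars with `+` and `*` are supported. The point is that ordinary Python control flow (here a `for` loop) decides which operations are recorded, and gradients follow from that record.

```python
class Value:
    """Hypothetical scalar that records the operations applied to it at run time."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents      # Values this one was computed from
        self._grad_fns = grad_fns    # maps upstream gradient to each parent's share

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        # Build a topological order of the recorded graph, then apply the chain rule
        # from the output back to the inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node._parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, grad_fn in zip(node._parents, node._grad_fns):
                parent.grad += grad_fn(node.grad)


def model(x, w, steps):
    # A plain Python loop defines the computation; nothing is declared in advance.
    y = x
    for _ in range(steps):
        y = y * w + 1.0
    return y


x, w = Value(3.0), Value(2.0)
y = model(x, w, steps=2)   # y = x*w^2 + w + 1
y.backward()               # fills x.grad and w.grad
```

Because the graph is rebuilt on every call, changing `steps` changes the differentiated program without any separate graph-definition step, which is the property the abstract describes as "code as a model".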

Generalized Inner Loop Meta-Learning

Oct 07, 2019
Edward Grefenstette, Brandon Amos, Denis Yarats, Phu Mon Htut, Artem Molchanov, Franziska Meier, Douwe Kiela, Kyunghyun Cho, Soumith Chintala

Many (but not all) approaches self-qualifying as "meta-learning" in deep learning and reinforcement learning fit a common pattern of approximating the solution to a nested optimization problem. In this paper, we give a formalization of this shared pattern, which we call GIMLI, prove its general requirements, and derive a general-purpose algorithm for implementing similar approaches. Based on this analysis and algorithm, we describe a library of our design, higher, which we share with the community to assist and enable future research into these kinds of meta-learning approaches. We end the paper by showcasing the practical applications of this framework and library through illustrative experiments and ablation studies which they facilitate.

* 17 pages, 3 figures, 1 algorithm
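The nested optimization pattern described in the abstract above can be illustrated with a toy example. The sketch below is not the GIMLI algorithm or the `higher` library API; it is a hand-derived, scalar stand-in under stated assumptions: the inner loss is `(theta - train_target)**2`, the outer loss is `(theta - val_target)**2`, and the meta-parameter is the inner learning rate `lr`. All function names and default values are hypothetical.

```python
def unrolled_inner_loop(theta0, lr, train_target, steps):
    """Run `steps` of gradient descent on (theta - train_target)**2.

    Returns the final theta and d(theta)/d(lr), accumulated alongside the updates
    so the outer problem can differentiate through the inner loop.
    """
    theta, dtheta_dlr = theta0, 0.0
    for _ in range(steps):
        grad = 2.0 * (theta - train_target)      # inner gradient
        dgrad_dlr = 2.0 * dtheta_dlr             # how that gradient depends on lr
        # Differentiate theta' = theta - lr * grad with respect to lr.
        dtheta_dlr = dtheta_dlr - grad - lr * dgrad_dlr
        theta = theta - lr * grad
    return theta, dtheta_dlr


def outer_loss_and_grad(lr, theta0=0.0, train_target=1.0, val_target=1.0, steps=3):
    """Outer (meta) loss after the unrolled inner loop, and its gradient w.r.t. lr."""
    theta, dtheta_dlr = unrolled_inner_loop(theta0, lr, train_target, steps)
    loss = (theta - val_target) ** 2
    return loss, 2.0 * (theta - val_target) * dtheta_dlr


def meta_train(lr=0.1, meta_lr=0.05, meta_steps=50):
    """Outer loop: adjust the inner learning rate by gradient descent."""
    for _ in range(meta_steps):
        _, grad = outer_loss_and_grad(lr)
        lr -= meta_lr * grad
    return lr


learned_lr = meta_train()
```

In practice the derivative through the inner loop would come from an automatic differentiation system rather than being written out by hand; the manual version is used here only to keep the example dependency-free.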

Wasserstein GAN

Dec 06, 2017
Martin Arjovsky, Soumith Chintala, Léon Bottou

We introduce a new algorithm named WGAN, an alternative to traditional GAN training. In this new model, we show that we can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches. Furthermore, we show that the corresponding optimization problem is sound, and provide extensive theoretical work highlighting the deep connections to other distances between distributions.
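As a small illustration of the kind of distance between distributions the abstract above refers to, the sketch below computes the Wasserstein-1 distance between two equal-sized one-dimensional samples. This is not the WGAN training algorithm; it only uses the standard fact that, in one dimension with equally many points, the optimal transport plan matches sorted values. The function name `wasserstein_1d` is an assumption made for this example.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-sized 1D empirical samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In 1D the optimal coupling pairs the i-th smallest x with the i-th smallest y.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Two point masses with disjoint supports: the distance shrinks smoothly as the
# second sample moves toward the first, which gives a usable training signal.
reference = [0.0, 0.0, 0.0]
distances = [wasserstein_1d(reference, [shift] * 3) for shift in (2.0, 1.0, 0.5, 0.0)]
```

Here `distances` tracks how far apart the two samples are even when they do not overlap, which is one intuition for why such a distance can yield learning curves that correlate with sample quality.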

Discovering Causal Signals in Images

Oct 31, 2017
David Lopez-Paz, Robert Nishihara, Soumith Chintala, Bernhard Schölkopf, Léon Bottou

This paper establishes the existence of observable footprints that reveal the "causal dispositions" of the object categories appearing in collections of images. We achieve this goal in two steps. First, we take a learning approach to observational causal discovery, and build a classifier that achieves state-of-the-art performance on finding the causal direction between pairs of random variables, given samples from their joint distribution. Second, we use our causal direction classifier to effectively distinguish between features of objects and features of their contexts in collections of static images. Our experiments demonstrate the existence of a relation between the direction of causality and the difference between objects and their contexts, and by the same token, the existence of observable signals that reveal the causal dispositions of objects.

Transformation-Based Models of Video Sequences

Apr 24, 2017
Joost van Amersfoort, Anitha Kannan, Marc'Aurelio Ranzato, Arthur Szlam, Du Tran, Soumith Chintala

In this work we propose a simple unsupervised approach for next frame prediction in video. Instead of directly predicting the pixels in a frame given past frames, we predict the transformations needed for generating the next frame in a sequence, given the transformations of the past frames. This leads to sharper results, while using a smaller prediction model. In order to enable a fair comparison between different video frame prediction models, we also propose a new evaluation protocol. We use generated frames as input to a classifier trained with ground truth sequences. This criterion guarantees that models scoring high are those producing sequences which preserve discriminative features, as opposed to merely penalizing any deviation, plausible or not, from the ground truth. Our proposed approach compares favourably against more sophisticated ones on the UCF-101 data set, while also being more efficient in terms of the number of parameters and computational cost.

Training Language Models Using Target-Propagation

Feb 15, 2017
Sam Wiseman, Sumit Chopra, Marc'Aurelio Ranzato, Arthur Szlam, Ruoyu Sun, Soumith Chintala, Nicolas Vasilache

While Truncated Back-Propagation through Time (BPTT) is the most popular approach to training Recurrent Neural Networks (RNNs), it suffers from being inherently sequential (making parallelization difficult) and from truncating gradient flow between distant time-steps. We investigate whether Target Propagation (TPROP) style approaches can address these shortcomings. Unfortunately, extensive experiments suggest that TPROP generally underperforms BPTT, and we end with an analysis of this phenomenon, and suggestions for future work.

Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks

Nov 26, 2016
Nicolas Usunier, Gabriel Synnaeve, Zeming Lin, Soumith Chintala

We consider scenarios from the real-time strategy game StarCraft as new benchmarks for reinforcement learning algorithms. We propose micromanagement tasks, which present the problem of the short-term, low-level control of army members during a battle. From a reinforcement learning point of view, these scenarios are challenging because the state-action space is very large, and because there is no obvious feature representation for the state-action evaluation function. We describe our approach to tackle the micromanagement scenarios with deep neural network controllers from raw state features given by the game engine. In addition, we present a heuristic reinforcement learning algorithm which combines direct exploration in the policy space and backpropagation. This algorithm allows for the collection of traces for learning using deterministic policies, which appears much more efficient than, for example, ε-greedy exploration. Experiments show that with this algorithm, we successfully learn non-trivial strategies for scenarios with armies of up to 15 agents, where both Q-learning and REINFORCE struggle.

* 18 pages, 1 figure (2 plots), 2 tables
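For readers unfamiliar with the baseline the abstract above compares against, here is a minimal sketch of epsilon-greedy action selection. It is not the paper's episodic exploration algorithm; it shows only the standard baseline, in which randomness is injected independently at every action choice. The function name and the example values are assumptions made for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the highest-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))          # explore: uniform random action
    # Exploit: index of the largest estimated action value.
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical action-value estimates for three actions.
estimates = [0.1, 0.7, 0.3]
greedy_choice = epsilon_greedy(estimates, epsilon=0.0)
```

With many units each choosing actions at every step, this per-decision randomness compounds quickly, which is consistent with the abstract's remark that exploring with deterministic policies appears more efficient in these scenarios.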

Semantic Segmentation using Adversarial Networks

Nov 25, 2016
Pauline Luc, Camille Couprie, Soumith Chintala, Jakob Verbeek

Adversarial training has been shown to produce state of the art results for generative image modeling. In this paper we propose an adversarial training approach to train semantic segmentation models. We train a convolutional semantic segmentation network along with an adversarial network that discriminates segmentation maps coming either from the ground truth or from the segmentation network. The motivation for our approach is that it can detect and correct higher-order inconsistencies between ground truth segmentation maps and the ones produced by the segmentation net. Our experiments show that our adversarial training approach leads to improved accuracy on the Stanford Background and PASCAL VOC 2012 datasets.

* NIPS Workshop on Adversarial Training, Dec 2016, Barcelona, Spain

TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games

Nov 03, 2016
Gabriel Synnaeve, Nantas Nardelli, Alex Auvolat, Soumith Chintala, Timothée Lacroix, Zeming Lin, Florian Richoux, Nicolas Usunier

We present TorchCraft, a library that enables deep learning research on Real-Time Strategy (RTS) games such as StarCraft: Brood War, by making it easier to control these games from a machine learning framework, here Torch. This white paper argues for using RTS games as a benchmark for AI research, and describes the design and components of TorchCraft.
