Prafulla Dhariwal

Jukebox: A Generative Model for Music

Apr 30, 2020

Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever

Abstract: We introduce Jukebox, a model that generates music with singing in the raw audio domain. We tackle the long context of raw audio by using a multi-scale VQ-VAE to compress it to discrete codes and modeling those codes with autoregressive Transformers. We show that the combined model at scale can generate high-fidelity and diverse songs with coherence up to multiple minutes. We can condition on artist and genre to steer the musical and vocal style, and on unaligned lyrics to make the singing more controllable. We are releasing thousands of non-cherry-picked samples at https://jukebox.openai.com, along with model weights and code at https://github.com/openai/jukebox
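
To make the phrase "compress it to discrete codes" concrete, here is a minimal sketch of the vector-quantization lookup that a VQ-VAE performs, for a single level only. The codebook values, the latent vectors, and the function names `quantize` and `dequantize` are illustrative assumptions, not taken from the Jukebox code; the encoder, decoder, multi-scale hierarchy, and Transformer priors are omitted.

```python
def quantize(latents, codebook):
    """Map each latent vector to the index of its nearest codebook vector."""
    codes = []
    for z in latents:
        # Squared Euclidean distance from z to every codebook entry.
        dists = [sum((a - b) ** 2 for a, b in zip(z, e)) for e in codebook]
        codes.append(dists.index(min(dists)))
    return codes


def dequantize(codes, codebook):
    """Replace each discrete code with its codebook vector."""
    return [codebook[k] for k in codes]


# Toy codebook and encoder outputs (hypothetical values for illustration).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
latents = [[0.9, 1.2], [0.1, -0.2], [-0.8, 0.7]]

codes = quantize(latents, codebook)      # [1, 0, 2]
approx = dequantize(codes, codebook)     # vectors a decoder would receive
print(codes, approx)
```

The sequence of integer codes is what an autoregressive model would then be trained on, in place of the raw audio samples.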

Glow: Generative Flow with Invertible 1x1 Convolutions

Jul 10, 2018

Diederik P. Kingma, Prafulla Dhariwal

Abstract: Flow-based generative models (Dinh et al., 2014) are conceptually attractive due to tractability of the exact log-likelihood, tractability of exact latent-variable inference, and parallelizability of both training and synthesis. In this paper we propose Glow, a simple type of generative flow using an invertible 1x1 convolution. Using our method we demonstrate a significant improvement in log-likelihood on standard benchmarks. Perhaps most strikingly, we demonstrate that a generative model optimized towards the plain log-likelihood objective is capable of efficient realistic-looking synthesis and manipulation of large images. The code for our model is available at https://github.com/openai/glow

* 15 pages; fixed typo in abstract
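
The invertible 1x1 convolution named in the abstract can be illustrated with a small sketch. A 1x1 convolution applies the same channel-mixing matrix W at every pixel, so it is invertible whenever W is, and its contribution to the log-likelihood is height * width * log|det W| (this expression comes from the body of the Glow paper, not from the abstract). The 2-channel setup, the matrix values, and all function names below are assumptions made for illustration; this is not the code from the linked repository.

```python
import math


def conv1x1(x, W):
    """Apply the C x C matrix W to the channel vector at every pixel.

    x is indexed as x[row][col][channel].
    """
    return [[[sum(W[o][c] * px[c] for c in range(len(px)))
              for o in range(len(W))]
             for px in row]
            for row in x]


def det2(W):
    """Determinant of a 2 x 2 matrix."""
    return W[0][0] * W[1][1] - W[0][1] * W[1][0]


def inv2(W):
    """Inverse of a 2 x 2 matrix (assumes det2(W) != 0)."""
    d = det2(W)
    return [[W[1][1] / d, -W[0][1] / d],
            [-W[1][0] / d, W[0][0] / d]]


def log_det_term(W, height, width):
    """Log-determinant contribution of the layer for a height x width input."""
    return height * width * math.log(abs(det2(W)))


W = [[2.0, 1.0], [0.0, 1.0]]                      # hypothetical weights, det = 2
x = [[[1.0, 2.0], [3.0, 4.0]],
     [[5.0, 6.0], [7.0, 8.0]]]                    # 2 x 2 image, 2 channels

y = conv1x1(x, W)                                  # forward pass
x_rec = conv1x1(y, inv2(W))                        # exact inverse pass
print(y[0][0], x_rec[0][0], log_det_term(W, 2, 2))
```

Because the inverse is just another 1x1 convolution with the inverted matrix, synthesis costs about the same as inference, which is consistent with the parallelizability the abstract mentions.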

GamePad: A Learning Environment for Theorem Proving

Jun 02, 2018

Daniel Huang, Prafulla Dhariwal, Dawn Song, Ilya Sutskever

Abstract: In this paper, we introduce a system called GamePad that can be used to explore the application of machine learning methods to theorem proving in the Coq proof assistant. Interactive theorem provers such as Coq enable users to construct machine-checkable proofs in a step-by-step manner. Hence, they provide an opportunity to explore theorem proving at a human level of abstraction. We use GamePad to synthesize proofs for a simple algebraic rewrite problem and train baseline models for a formalization of the Feit-Thompson theorem. We address position evaluation (i.e., predicting the number of proof steps left) and tactic prediction (i.e., predicting the next proof step) tasks, which arise naturally in human-level theorem proving.

Parameter Space Noise for Exploration

Jan 31, 2018

Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, Marcin Andrychowicz

Abstract: Deep reinforcement learning (RL) methods generally engage in exploratory behavior through noise injection in the action space. An alternative is to add noise directly to the agent's parameters, which can lead to more consistent exploration and a richer set of behaviors. Methods such as evolutionary strategies use parameter perturbations, but discard all temporal structure in the process and require significantly more samples. Combining parameter noise with traditional RL methods makes it possible to get the best of both worlds. We demonstrate that both off- and on-policy methods benefit from this approach through experimental comparison of DQN, DDPG, and TRPO on high-dimensional discrete action environments as well as continuous control tasks. Our results show that RL with parameter noise learns more efficiently than traditional RL with action space noise and evolutionary strategies individually.

* Updated to camera-ready ICLR submission
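
The contrast the abstract draws between action-space and parameter-space noise can be sketched with a toy linear policy. Everything here is an illustrative assumption: the policy, the weights, the noise scale of 0.1, and the choice to perturb the weights once and reuse them for a whole episode. The adaptive noise scaling used in the paper is not shown.

```python
import random


def act(weights, state):
    """Deterministic linear policy: action = weights . state."""
    return sum(w * s for w, s in zip(weights, state))


def perturb(weights, sigma, rng):
    """Return a copy of the weights with independent Gaussian noise added."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


rng = random.Random(0)
weights = [0.5, -0.3]      # hypothetical policy parameters
state = [1.0, 2.0]         # the same state, visited three times

# Action-space noise: fresh noise is added to the action at every step,
# so the same state yields a different action each time.
noisy_actions = [act(weights, state) + rng.gauss(0.0, 0.1) for _ in range(3)]

# Parameter-space noise: the weights are perturbed once, then the perturbed
# policy acts deterministically, so the same state yields the same action.
episode_weights = perturb(weights, 0.1, rng)
consistent_actions = [act(episode_weights, state) for _ in range(3)]

print(noisy_actions)
print(consistent_actions)
```

This state-dependent consistency is one way to read the abstract's claim of "more consistent exploration".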

Proximal Policy Optimization Algorithms

Aug 28, 2017

John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov

Abstract: We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent. Whereas standard policy gradient methods perform one gradient update per data sample, we propose a novel objective function that enables multiple epochs of minibatch updates. The new methods, which we call proximal policy optimization (PPO), have some of the benefits of trust region policy optimization (TRPO), but they are much simpler to implement, more general, and have better sample complexity (empirically). Our experiments test PPO on a collection of benchmark tasks, including simulated robotic locomotion and Atari game playing, and we show that PPO outperforms other online policy gradient methods, and overall strikes a favorable balance between sample complexity, simplicity, and wall-time.
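
The abstract does not spell out the "surrogate" objective. The best-known variant from the body of the PPO paper is the clipped surrogate, which averages min(r * A, clip(r, 1 - epsilon, 1 + epsilon) * A) over samples, where r is the probability ratio between the new and old policy and A is an advantage estimate. The sketch below evaluates that quantity on made-up numbers; the function name, the sample values, and the default epsilon of 0.2 are assumptions for illustration, and no policy network or optimizer is included.

```python
def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over the samples."""
    terms = []
    for r, a in zip(ratios, advantages):
        clipped_r = max(1.0 - epsilon, min(r, 1.0 + epsilon))
        # Taking the minimum makes the objective a pessimistic bound, which
        # removes the incentive to push the ratio far outside the clip range.
        terms.append(min(r * a, clipped_r * a))
    return sum(terms) / len(terms)


# Hypothetical probability ratios pi_new / pi_old and advantage estimates.
ratios = [1.5, 0.5, 1.0]
advantages = [1.0, -1.0, 2.0]

objective = clipped_surrogate(ratios, advantages)
print(objective)  # approximately 0.8
```

Because the clipped term limits how much any single sample can reward moving the policy, the same batch can be reused for several epochs of minibatch updates, which is the property the abstract highlights.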

Variational Lossy Autoencoder

Mar 04, 2017

Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskever, Pieter Abbeel

Abstract: Representation learning seeks to expose certain aspects of observed data in a learned representation that's amenable to downstream tasks like classification. For instance, a good representation for 2D images might be one that describes only global structure and discards information about detailed texture. In this paper, we present a simple but principled method to learn such global representations by combining the Variational Autoencoder (VAE) with neural autoregressive models such as RNN, MADE and PixelRNN/CNN. Our proposed VAE model allows us to control what the global latent code can learn and, by designing the architecture accordingly, we can force the global latent code to discard irrelevant information such as texture in 2D images, so that the VAE only "autoencodes" data in a lossy fashion. In addition, by leveraging autoregressive models as both the prior distribution $p(z)$ and the decoding distribution $p(x|z)$, we can greatly improve the generative modeling performance of VAEs, achieving new state-of-the-art results on MNIST, OMNIGLOT and Caltech-101 Silhouettes density estimation tasks.

* Added CIFAR10 experiments; ICLR 2017
