
Jascha Sohl-Dickstein

Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping


Oct 05, 2021
James Martens, Andy Ballard, Guillaume Desjardins, Grzegorz Swirszcz, Valentin Dalibard, Jascha Sohl-Dickstein, Samuel S. Schoenholz



Training Learned Optimizers with Randomly Initialized Learned Optimizers


Jan 14, 2021
Luke Metz, C. Daniel Freeman, Niru Maheswaranathan, Jascha Sohl-Dickstein



Parallel Training of Deep Networks with Local Updates


Dec 07, 2020
Michael Laskin, Luke Metz, Seth Nabarro, Mark Saroufim, Badreddine Noune, Carlo Luschi, Jascha Sohl-Dickstein, Pieter Abbeel

* First two authors, Michael Laskin and Luke Metz, contributed equally. Order was determined by a coin flip.


Score-Based Generative Modeling through Stochastic Differential Equations


Nov 26, 2020
Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole



Towards NNGP-guided Neural Architecture Search


Nov 11, 2020
Daniel S. Park, Jaehoon Lee, Daiyi Peng, Yuan Cao, Jascha Sohl-Dickstein

* 13 + 6 pages, 19 figures; open-source code available at https://github.com/google-research/google-research/tree/master/nngp_nas 


Reverse engineering learned optimizers reveals known and novel mechanisms


Nov 04, 2020
Niru Maheswaranathan, David Sussillo, Luke Metz, Ruoxi Sun, Jascha Sohl-Dickstein



Is Batch Norm unique? An empirical investigation and prescription to emulate the best properties of common normalizers without batch dependence


Oct 21, 2020
Vinay Rao, Jascha Sohl-Dickstein



Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves


Sep 23, 2020
Luke Metz, Niru Maheswaranathan, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein



Finite Versus Infinite Neural Networks: an Empirical Study


Sep 08, 2020
Jaehoon Lee, Samuel S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Sohl-Dickstein

* 17+11 pages; v2 references added, minor improvements 


Whitening and second order optimization both destroy information about the dataset, and can make generalization impossible


Aug 25, 2020
Neha S. Wadia, Daniel Duckworth, Samuel S. Schoenholz, Ethan Dyer, Jascha Sohl-Dickstein

* 15+7 pages, 7 figures; added references, edited model descriptions for clarity, results unchanged 


A new method for parameter estimation in probabilistic models: Minimum probability flow


Jul 17, 2020
Jascha Sohl-Dickstein, Peter Battaglino, Michael R. DeWeese

* Originally published 2011. Uploaded to arXiv 2020. arXiv admin note: text overlap with arXiv:0906.4779, arXiv:1205.4295 


Exact posterior distributions of wide Bayesian neural networks


Jun 18, 2020
Jiri Hron, Yasaman Bahri, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein



Infinite attention: NNGP and NTK for deep attention networks


Jun 18, 2020
Jiri Hron, Yasaman Bahri, Jascha Sohl-Dickstein, Roman Novak

* ICML 2020 


Your GAN is Secretly an Energy-based Model and You Should use Discriminator Driven Latent Sampling


Mar 24, 2020
Tong Che, Ruixiang Zhang, Jascha Sohl-Dickstein, Hugo Larochelle, Liam Paull, Yuan Cao, Yoshua Bengio



Using a thousand optimization tasks to learn hyperparameter search strategies


Mar 11, 2020
Luke Metz, Niru Maheswaranathan, Ruoxi Sun, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein



The large learning rate phase of deep learning: the catapult mechanism


Mar 04, 2020
Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Sohl-Dickstein, Guy Gur-Ari

* 25 pages, 19 figures 


On the infinite width limit of neural networks with a standard parameterization


Jan 25, 2020
Jascha Sohl-Dickstein, Roman Novak, Samuel S. Schoenholz, Jaehoon Lee



Neural Tangents: Fast and Easy Infinite Neural Networks in Python


Dec 05, 2019
Roman Novak, Lechao Xiao, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Sohl-Dickstein, Samuel S. Schoenholz


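The Neural Tangents entry above refers to an open-source Python/JAX library. As a rough illustration only (not code from the paper, and assuming the public neural-tangents package API), computing the analytic infinite-width NNGP and NTK kernels of a small fully-connected network might look like this:

    import jax
    import jax.numpy as jnp
    from neural_tangents import stax  # assumes the open-source neural-tangents package

    # Describe a 3-layer fully-connected ReLU architecture.
    # stax.serial returns (init_fn, apply_fn, kernel_fn); kernel_fn evaluates the
    # corresponding infinite-width kernels in closed form.
    init_fn, apply_fn, kernel_fn = stax.serial(
        stax.Dense(512), stax.Relu(),
        stax.Dense(512), stax.Relu(),
        stax.Dense(1)
    )

    key1, key2 = jax.random.split(jax.random.PRNGKey(0))
    x1 = jax.random.normal(key1, (3, 10))  # 3 inputs, 10 features each
    x2 = jax.random.normal(key2, (4, 10))  # 4 inputs, 10 features each

    # NNGP and NTK kernel matrices between the two input batches.
    kernels = kernel_fn(x1, x2, ('nngp', 'ntk'))
    print(kernels.nngp.shape, kernels.ntk.shape)  # (3, 4) (3, 4)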

Neural reparameterization improves structural optimization


Sep 14, 2019
Stephan Hoyer, Jascha Sohl-Dickstein, Sam Greydanus



Using learned optimizers to make models robust to input noise


Jun 08, 2019
Luke Metz, Niru Maheswaranathan, Jonathon Shlens, Jascha Sohl-Dickstein, Ekin D. Cubuk



The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study


May 09, 2019
Daniel S. Park, Jascha Sohl-Dickstein, Quoc V. Le, Samuel L. Smith

* 17 pages, 3 tables, 17 figures; accepted to ICML 2019 


A RAD approach to deep mixture models


Mar 18, 2019
Laurent Dinh, Jascha Sohl-Dickstein, Razvan Pascanu, Hugo Larochelle

* 9 pages of main content, 4 pages of appendices 


A Mean Field Theory of Batch Normalization


Mar 05, 2019
Greg Yang, Jeffrey Pennington, Vinay Rao, Jascha Sohl-Dickstein, Samuel S. Schoenholz

* To appear in ICLR 2019 


Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent


Feb 18, 2019
Jaehoon Lee, Lechao Xiao, Samuel S. Schoenholz, Yasaman Bahri, Jascha Sohl-Dickstein, Jeffrey Pennington

* 10+8 pages, 13 figures 
