
Jascha Sohl-Dickstein

Score-Based Generative Modeling through Stochastic Differential Equations

Nov 26, 2020

Towards NNGP-guided Neural Architecture Search

Nov 11, 2020

Reverse engineering learned optimizers reveals known and novel mechanisms

Nov 04, 2020

Is Batch Norm unique? An empirical investigation and prescription to emulate the best properties of common normalizers without batch dependence

Oct 21, 2020

Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves

Sep 23, 2020

Finite Versus Infinite Neural Networks: an Empirical Study

Sep 08, 2020

Whitening and second order optimization both destroy information about the dataset, and can make generalization impossible

Aug 25, 2020

A new method for parameter estimation in probabilistic models: Minimum probability flow

Jul 17, 2020

Exact posterior distributions of wide Bayesian neural networks

Jun 18, 2020

Infinite attention: NNGP and NTK for deep attention networks

Jun 18, 2020