Samuel L. Smith

Cold Posteriors and Aleatoric Uncertainty

Jul 31, 2020

On the Generalization Benefit of Noise in Stochastic Gradient Descent

Jun 26, 2020

Batch Normalization Biases Deep Residual Networks Towards Shallow Paths

Feb 24, 2020

The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study

May 09, 2019

Stochastic natural gradient descent draws posterior samples in function space

Oct 16, 2018

Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks

May 09, 2018

Don't Decay the Learning Rate, Increase the Batch Size

Feb 24, 2018

A Bayesian Perspective on Generalization and Stochastic Gradient Descent

Feb 14, 2018

Offline bilingual word vectors, orthogonal transformations and the inverted softmax

Feb 13, 2017