Samuel L. Smith

Characterizing signal propagation to close the performance gap in unnormalized ResNets

Jan 21, 2021
Andrew Brock, Soham De, Samuel L. Smith

Via arXiv

Cold Posteriors and Aleatoric Uncertainty

Jul 31, 2020
Ben Adlam, Jasper Snoek, Samuel L. Smith

Via arXiv

On the Generalization Benefit of Noise in Stochastic Gradient Descent

Jun 26, 2020
Samuel L. Smith, Erich Elsen, Soham De

Via arXiv

Batch Normalization Biases Deep Residual Networks Towards Shallow Paths

Feb 24, 2020
Soham De, Samuel L. Smith

Via arXiv

The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study

May 09, 2019
Daniel S. Park, Jascha Sohl-Dickstein, Quoc V. Le, Samuel L. Smith

Via arXiv

Stochastic natural gradient descent draws posterior samples in function space

Oct 16, 2018
Samuel L. Smith, Daniel Duckworth, Semon Rezchikov, Quoc V. Le, Jascha Sohl-Dickstein

Via arXiv

Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks

May 09, 2018
Vitalii Zhelezniak, Dan Busbridge, April Shen, Samuel L. Smith, Nils Y. Hammerla

Via arXiv

Don't Decay the Learning Rate, Increase the Batch Size

Feb 24, 2018
Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le

Via arXiv

A Bayesian Perspective on Generalization and Stochastic Gradient Descent

Feb 14, 2018
Samuel L. Smith, Quoc V. Le

Via arXiv

Offline bilingual word vectors, orthogonal transformations and the inverted softmax

Feb 13, 2017
Samuel L. Smith, David H. P. Turban, Steven Hamblin, Nils Y. Hammerla

Via arXiv