Get our free extension to see links to code for papers anywhere online!

 Add to Chrome

 Add to Firefox

CatalyzeX Code Finder - Browser extension linking code for ML papers across the web! | Product Hunt Embed

Theoretical Understanding of Batch-normalization: A Markov Chain Perspective

Mar 03, 2020
Hadi Daneshmand, Jonas Kohler, Francis Bach, Thomas Hofmann, Aurelien Lucchi

Batch-normalization (BN) is a key component to effectively train deep neural networks. Empirical evidence has shown that without BN, the training process is prone to unstabilities. This is however not well understood from a theoretical point of view. Leveraging tools from Markov chain theory, we show that BN has a direct effect on the rank of the pre-activation matrices of a neural network. Specifically, while deep networks without BN exhibit rank collapse and poor training performance, networks equipped with BN have a higher rank. In an extensive set of experiments on standard neural network architectures and datasets, we show that the latter quantity is a good predictor for the optimization speed of training.

Share this with someone who'll enjoy it:

   Access Paper Source

Share this with someone who'll enjoy it: