Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox


A Boo(n) for Evaluating Architecture Performance

Jul 23, 2018
Ondrej Bajgar, Rudolf Kadlec, Jan Kleindienst



We point out important problems with the common practice of using the best single model performance for comparing deep learning architectures, and we propose a method that corrects these flaws. Each time a model is trained, one gets a different result due to random factors in the training process, which include random parameter initialization and random data shuffling. Reporting the best single model performance does not appropriately address this stochasticity. We propose a normalized expected best-out-of-$n$ performance ($\text{Boo}_n$) as a way to correct these problems.

* Proceedings of the 35th International Conference on Machine Learning (ICML 2018). Volume 80 of the Proceedings of Machine Learning Research (PMLR) 
* ICML 2018 


Share this with someone who'll enjoy it:

   Access Paper Source



Share this with someone who'll enjoy it: