Title: A jamming transition from under- to over-parametrization affects loss landscape and generalization

Oct 22, 2018

Stefano Spigler, Mario Geiger, Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Matthieu Wyart

Abstract: We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved. Under some general conditions, we show that this transition is sharp for the hinge loss. In the whole over-parametrized regime, poor minima of the loss are not encountered during training since the number of constraints to satisfy is too small to hamper minimization. Our findings support a link between this transition and the generalization properties of the network: as we increase the number of parameters of a given model, starting from an under-parametrized network, we observe that the generalization error displays three phases: (i) initial decay, (ii) increase until the transition point, where it displays a cusp, and (iii) power law decay toward a constant for the rest of the over-parametrized regime. Thereby we identify the region where the classical phenomenon of over-fitting takes place, and the region where the model keeps improving, in line with previous empirical observations for modern neural networks. The theoretical results presented here appeared elsewhere for a physics audience. The results on generalization are new.

* 11 pages, 6 figures, submitted to NIPS workshop "Integration of Deep Learning Theories". arXiv admin note: substantial text overlap with arXiv:1809.09349
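
As a rough illustration of the "constraints" picture in the abstract, the sketch below computes a hinge loss and counts how many training patterns remain unsatisfied. It assumes the standard hinge form max(0, margin - y*f) with margin 1; the exact loss used in the paper may differ (for example, a squared variant), and the function name `hinge_loss` and the toy outputs are hypothetical.

```python
def hinge_loss(outputs, labels, margin=1.0):
    """Return (mean hinge loss, number of unsatisfied constraints).

    Assumption: a pattern counts as a satisfied constraint when
    label * output >= margin; otherwise it contributes margin - label * output.
    """
    gaps = [margin - y * f for f, y in zip(outputs, labels)]
    unsatisfied = [g for g in gaps if g > 0]
    loss = sum(unsatisfied) / len(gaps)
    return loss, len(unsatisfied)


# Toy outputs (made up): every pattern is fitted with the required margin,
# so no constraint is violated and the loss vanishes.
loss_fit, n_fit = hinge_loss([2.0, -1.5, 1.0], [1, -1, 1])

# Toy outputs (made up): the second pattern is misclassified,
# leaving one unsatisfied constraint and a non-zero loss.
loss_bad, n_bad = hinge_loss([2.0, 0.5, 1.0], [1, -1, 1])
```

In this reading, the over-parametrized regime corresponds to training ending with zero unsatisfied constraints, while in the under-parametrized regime some constraints stay violated and the loss stays positive.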
