Title: Orthogonal Over-Parameterized Training

Apr 09, 2020

Weiyang Liu, Rongmei Lin, Zhen Liu, James M. Rehg, Li Xiong, Le Song

Figures 1-4 for Orthogonal Over-Parameterized Training

Abstract: The inductive bias of a neural network is largely determined by its architecture and its training algorithm. To achieve good generalization, training a neural network effectively is even more important than designing the architecture. We propose a novel orthogonal over-parameterized training (OPT) framework that can provably minimize the hyperspherical energy, which characterizes the diversity of neurons on a hypersphere. By constantly maintaining the minimum hyperspherical energy during training, OPT can greatly improve network generalization. Specifically, OPT fixes the randomly initialized weights of the neurons and learns an orthogonal transformation that is applied to these neurons. We propose multiple ways to learn such an orthogonal transformation, including unrolling orthogonalization algorithms, applying orthogonal parameterization, and designing orthogonality-preserving gradient updates. Interestingly, OPT reveals that learning a proper coordinate system for neurons is crucial to generalization and may be more important than learning a specific relative position of neurons. We further provide theoretical insights into why OPT yields better generalization. Extensive experiments validate the superiority of OPT.
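The core idea in the abstract (fixed random neurons, one orthogonal transformation applied to all of them) can be illustrated with a small sketch. It is not the authors' implementation. It assumes the hyperspherical energy is the sum of inverse pairwise distances between unit-normalized neurons, and it builds the orthogonal transformation as a product of Householder reflections, chosen here only as one convenient orthogonal parameterization. All names (`hyperspherical_energy`, `householder_apply`, `orthogonal_transform`, `reflectors`) are hypothetical. The sketch checks that the energy of the neurons is unchanged by the transformation, which is why keeping the neurons fixed and learning only the transformation preserves whatever energy the initialization had.

```python
import math
import random


def normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def hyperspherical_energy(neurons):
    """Sum of inverse pairwise distances between unit-normalized neurons.

    Assumption: this inverse-distance kernel is one common choice of energy;
    the abstract does not state the exact form.
    """
    unit = [normalize(w) for w in neurons]
    energy = 0.0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            energy += 1.0 / math.dist(unit[i], unit[j])
    return energy


def householder_apply(v, x):
    """Apply the reflection H = I - 2 v v^T / (v^T v) to the vector x."""
    scale = 2.0 * sum(a * b for a, b in zip(v, x)) / sum(a * a for a in v)
    return [b - scale * a for a, b in zip(v, x)]


def orthogonal_transform(reflectors, x):
    """Apply a product of Householder reflections, which is orthogonal."""
    for v in reflectors:
        x = householder_apply(v, x)
    return x


rng = random.Random(0)
dim, num_neurons = 8, 5

# Randomly initialized neurons; in this sketch they are never updated.
fixed_neurons = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_neurons)]

# Parameters of the orthogonal transformation; in training these would be
# the learnable quantities. Here they are just random for illustration.
reflectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]

transformed = [orthogonal_transform(reflectors, w) for w in fixed_neurons]

e_before = hyperspherical_energy(fixed_neurons)
e_after = hyperspherical_energy(transformed)
print("energy preserved:", abs(e_before - e_after) < 1e-9)
```

Because every reflection preserves norms and inner products, any setting of `reflectors` yields the same pairwise angles between neurons, so gradient updates to `reflectors` alone cannot change the energy in this sketch.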

* Technical Report
