Title: Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian

Nov 12, 2020

Jack Parker-Holder, Luke Metz, Cinjon Resnick, Hengyuan Hu, Adam Lerer, Alistair Letcher, Alex Peysakhovich, Aldo Pacchiano, Jakob Foerster

Abstract: Over the last decade, a single algorithm has changed many facets of our lives: Stochastic Gradient Descent (SGD). In the era of ever-decreasing loss functions, SGD and its various offspring have become the go-to optimization tool in machine learning and are a key component of the success of deep neural networks (DNNs). While SGD is guaranteed to converge to a local optimum (under loose assumptions), in some cases it may matter which local optimum is found, and this is often context-dependent. Examples frequently arise in machine learning, from shape-versus-texture features to ensemble methods and zero-shot coordination. In these settings, there are desired solutions which SGD on 'standard' loss functions will not find, since it instead converges to the 'easy' solutions. In this paper, we present a different approach. Rather than following the gradient, which corresponds to a locally greedy direction, we instead follow the eigenvectors of the Hessian, which we call "ridges". By iteratively following and branching amongst the ridges, we effectively span the loss surface to find qualitatively different solutions. We show both theoretically and experimentally that our method, called Ridge Rider (RR), offers a promising direction for a variety of challenging problems.
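
To make the idea of following Hessian eigenvectors concrete, here is a minimal sketch on a two-dimensional toy problem. It is not the authors' implementation: the loss function, the step sizes, the stopping rule (stop once the followed eigenvalue is no longer negative), and every function name below are assumptions chosen for illustration. The toy loss has a saddle at the origin, where the gradient is zero, and two minima at (-1, 0) and (1, 0). Branching in both directions along the negative-curvature eigenvector reaches both minima, whereas plain gradient descent started at the saddle does not move.

```python
import math

# Toy loss (an assumption for illustration): saddle at (0, 0), minima at (-1, 0) and (1, 0).
def loss(p):
    x, y = p
    return 0.25 * x**4 - 0.5 * x**2 + 0.5 * y**2

def grad(p):
    x, y = p
    return (x**3 - x, y)

def hessian(p):
    x, _ = p
    return ((3.0 * x**2 - 1.0, 0.0), (0.0, 1.0))

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def eigh2(h):
    """Eigenpairs (value, unit vector) of a symmetric 2x2 matrix, ascending by value."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    pairs = []
    for lam, fallback in ((mean - radius, (1.0, 0.0)), (mean + radius, (0.0, 1.0))):
        # Both candidates solve (H - lam * I) v = 0; keep the longer one for stability.
        v1, v2 = (b, lam - a), (lam - d, b)
        v = v1 if math.hypot(*v1) >= math.hypot(*v2) else v2
        n = math.hypot(*v)
        # If H is a multiple of the identity, any direction works; use an axis vector.
        pairs.append((lam, (v[0] / n, v[1] / n) if n > 0.0 else fallback))
    return pairs

def follow_ridge(start, direction, step=0.05, max_steps=200):
    """Step along the eigenvector best aligned with the current direction
    until its eigenvalue is no longer negative (simplified stopping rule)."""
    p, v = start, direction
    for _ in range(max_steps):
        lam, e = max(eigh2(hessian(p)), key=lambda pair: abs(dot(pair[1], v)))
        if lam >= 0.0:
            break
        if dot(e, v) < 0.0:  # eigenvectors have no sign; keep moving the same way
            e = (-e[0], -e[1])
        v = e
        p = (p[0] + step * v[0], p[1] + step * v[1])
    return p

def descend(p, lr=0.1, steps=500):
    """Plain gradient descent, used here to finish from the end of a ridge."""
    for _ in range(steps):
        g = grad(p)
        p = (p[0] - lr * g[0], p[1] - lr * g[1])
    return p

saddle = (0.0, 0.0)
solutions = []
for lam, e in eigh2(hessian(saddle)):
    if lam < 0.0:  # branch only along negative-curvature directions
        for sign in (1.0, -1.0):
            start_dir = (sign * e[0], sign * e[1])
            solutions.append(descend(follow_ridge(saddle, start_dir)))

for s in solutions:
    print(s, loss(s))
```

In this sketch each branch ends in a different minimum, which is the sense in which following ridges can surface distinct solutions. Real loss surfaces are high-dimensional, so a practical version would need approximate eigenvector computation rather than the closed-form 2x2 solver used here.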

* Camera-ready version, NeurIPS 2020
