Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

Practical tradeoffs between memory, compute, and performance in learned optimizers


Apr 01, 2022
Luke Metz, C. Daniel Freeman, James Harrison, Niru Maheswaranathan, Jascha Sohl-Dickstein


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Understanding How Encoder-Decoder Architectures Attend


Oct 28, 2021
Kyle Aitken, Vinay V Ramasesh, Yuan Cao, Niru Maheswaranathan

* 10+14 pages, 16 figures. NeurIPS 2021 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Training Learned Optimizers with Randomly Initialized Learned Optimizers


Jan 14, 2021
Luke Metz, C. Daniel Freeman, Niru Maheswaranathan, Jascha Sohl-Dickstein


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Reverse engineering learned optimizers reveals known and novel mechanisms


Nov 04, 2020
Niru Maheswaranathan, David Sussillo, Luke Metz, Ruoxi Sun, Jascha Sohl-Dickstein


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

The geometry of integration in text classification RNNs


Oct 28, 2020
Kyle Aitken, Vinay V. Ramasesh, Ankush Garg, Yuan Cao, David Sussillo, Niru Maheswaranathan

* 9+19 pages, 30 figures 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves


Sep 23, 2020
Luke Metz, Niru Maheswaranathan, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

How recurrent networks implement contextual processing in sentiment analysis


Apr 17, 2020
Niru Maheswaranathan, David Sussillo


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Using a thousand optimization tasks to learn hyperparameter search strategies


Mar 11, 2020
Luke Metz, Niru Maheswaranathan, Ruoxi Sun, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction


Dec 12, 2019
Hidenori Tanaka, Aran Nayebi, Niru Maheswaranathan, Lane McIntosh, Stephen A. Baccus, Surya Ganguli

* Neural Information Processing Systems (NeurIPS), 2019 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email
1
2
>>