Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective

Feb 13, 2024
Wu Lin, Felix Dangel, Runa Eschenhagen, Juhan Bae, Richard E. Turner, Alireza Makhzani

View paper on arXiv