Xiaoxia Wu

Hierarchical Learning for Generation with Long Source Sequences

Apr 15, 2021
Tobias Rohde, Xiaoxia Wu, Yinhan Liu

When Do Curricula Work?

Dec 05, 2020
Xiaoxia Wu, Ethan Dyer, Behnam Neyshabur

* ICLR 2021 

Choosing the Sample with Lowest Loss makes SGD Robust

Jan 10, 2020
Vatsal Shah, Xiaoxia Wu, Sujay Sanghavi

Implicit Regularization of Normalization Methods

Nov 23, 2019
Xiaoxia Wu, Edgar Dobriban, Tongzheng Ren, Shanshan Wu, Zhiyuan Li, Suriya Gunasekar, Rachel Ward, Qiang Liu

Linear Convergence of Adaptive Stochastic Gradient Descent

Aug 28, 2019
Yuege Xie, Xiaoxia Wu, Rachel Ward

Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network

Feb 19, 2019
Xiaoxia Wu, Simon S. Du, Rachel Ward

AdaGrad stepsizes: Sharp convergence over nonconvex landscapes, from any initialization

Jun 21, 2018
Rachel Ward, Xiaoxia Wu, Léon Bottou

* 17 pages, 3 figures 

WNGrad: Learn the Learning Rate in Gradient Descent

Mar 07, 2018
Xiaoxia Wu, Rachel Ward, Léon Bottou

* 10 pages, 3 figures, conference 
