Xiaoxia Wu

Hierarchical Learning for Generation with Long Source Sequences


Apr 15, 2021
Tobias Rohde, Xiaoxia Wu, Yinhan Liu



When Do Curricula Work?


Dec 05, 2020
Xiaoxia Wu, Ethan Dyer, Behnam Neyshabur

* ICLR 2021 


Choosing the Sample with Lowest Loss makes SGD Robust


Jan 10, 2020
Vatsal Shah, Xiaoxia Wu, Sujay Sanghavi



Implicit Regularization of Normalization Methods


Nov 23, 2019
Xiaoxia Wu, Edgar Dobriban, Tongzheng Ren, Shanshan Wu, Zhiyuan Li, Suriya Gunasekar, Rachel Ward, Qiang Liu



Linear Convergence of Adaptive Stochastic Gradient Descent


Aug 28, 2019
Yuege Xie, Xiaoxia Wu, Rachel Ward



Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network


Feb 19, 2019
Xiaoxia Wu, Simon S. Du, Rachel Ward



AdaGrad stepsizes: Sharp convergence over nonconvex landscapes, from any initialization


Jun 21, 2018
Rachel Ward, Xiaoxia Wu, Léon Bottou

* 17 pages, 3 figures 


WNGrad: Learn the Learning Rate in Gradient Descent


Mar 07, 2018
Xiaoxia Wu, Rachel Ward, Léon Bottou

* 10 pages, 3 figures, conference 
