* ICLR 2018: We find that 99.9% of the gradient exchange in distributed SGD is redundant; we reduce the communication bandwidth by two orders of magnitude without losing accuracy. A sketch of the underlying idea follows below.
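To illustrate how most of the gradient exchange can be dropped, here is a minimal NumPy sketch of top-k gradient sparsification: only the largest-magnitude 0.1% of entries are transmitted, and the rest are accumulated locally for later rounds. The function name and structure are hypothetical, not the paper's implementation, which additionally uses momentum correction and related techniques.

```python
import numpy as np

def sparsify_gradient(grad, keep_ratio=0.001):
    """Hypothetical helper: keep the largest-magnitude ~0.1% of
    gradient entries and return the rest as a local residual."""
    flat = grad.ravel()
    k = max(1, int(flat.size * keep_ratio))
    # Indices of the k largest-magnitude entries (these get sent).
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    values = flat[idx]
    # Entries not sent stay behind as local accumulation for the
    # next round, so no gradient information is permanently lost.
    residual = flat.copy()
    residual[idx] = 0.0
    return idx, values, residual.reshape(grad.shape)

# Example: exchange ~0.1% of entries instead of the full gradient.
grad = np.random.randn(1_000_000).astype(np.float32)
idx, vals, residual = sparsify_gradient(grad)
print(f"sent {idx.size} of {grad.size} entries "
      f"({idx.size / grad.size:.2%} of the gradient)")
```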
* Accepted as a full paper at FPGA'17, Monterey, CA; also appeared at the 1st International Workshop on Efficient Methods for Deep Neural Networks at NIPS 2016, Barcelona, Spain.