Xiangru Lian

Revisit Batch Normalization: New Understanding from an Optimization View and a Refinement via Composition Optimization

Oct 15, 2018

Asynchronous Decentralized Parallel Stochastic Gradient Descent

Sep 25, 2018

D$^2$: Decentralized Training over Decentralized Data

Apr 20, 2018

Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent

Sep 11, 2017

Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization

Jun 10, 2017

Staleness-aware Async-SGD for Distributed Deep Learning

Apr 05, 2016