# On the Convergence of FedAvg on Non-IID Data

Jul 04, 2019
Xiang Li, Kaixuan Huang, Wenhao Yang, Shusen Wang, Zhihua Zhang

Federated learning enables a large number of edge computing devices to jointly learn a centralized model while keeping all local data on the edge devices. As a leading algorithm in this setting, Federated Averaging (`FedAvg`) runs Stochastic Gradient Descent (SGD) in parallel on a small subset of the total devices and averages the resulting local iterates only once in a while. Despite its simplicity, it lacks theoretical guarantees in the federated setting. In this paper, we analyze the convergence of `FedAvg` on non-iid data. We investigate the effect of different sampling and averaging schemes, which are crucial especially when data are unbalanced. We prove a concise convergence rate of $\mathcal{O}(\frac{1}{T})$ for `FedAvg` with proper sampling and averaging schemes in convex problems, where $T$ is the total number of steps. Our results show that heterogeneity of data slows down the convergence, which is intrinsic to the federated setting. A low device participation rate can be achieved without severely harming the optimization process in federated learning. We show that there is a trade-off between communication efficiency and convergence rate. We analyze the necessity of learning rate decay by taking linear regression as an example. Our work serves as a guideline for algorithm design in applications of federated learning, where heterogeneity and imbalance of data are the common case.
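
To make the moving parts of the abstract concrete, here is a minimal sketch of a `FedAvg`-style loop on a synthetic linear regression problem. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`make_devices`, `global_loss`, `fedavg`), the synthetic data, and all hyperparameters are hypothetical. The sampling and averaging combination shown (sample devices with probability proportional to their data size, then take a plain average) is one plausible choice and may differ from the schemes analyzed in the paper. The learning rate decays as $\beta/(t+\gamma)$ over the total step count $t$.

```python
import numpy as np

def make_devices(num_devices=10, dim=3, seed=0):
    """Hypothetical non-iid, unbalanced data: one (X, y) pair per device."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=dim)
    devices = []
    for k in range(num_devices):
        n_k = int(rng.integers(20, 100))                  # unbalanced sample counts
        shift = 0.3 * k / num_devices                     # device-specific feature shift
        X = rng.normal(loc=shift, size=(n_k, dim))
        w_k = w_true + 0.1 * rng.normal(size=dim)         # device-specific target model
        y = X @ w_k + 0.1 * rng.normal(size=n_k)
        devices.append((X, y))
    return devices

def global_loss(w, devices):
    """Data-size-weighted average of the local least-squares losses."""
    n = sum(len(y) for _, y in devices)
    return sum(0.5 * np.sum((X @ w - y) ** 2) for X, y in devices) / n

def fedavg(devices, rounds=200, local_steps=5, num_sampled=4,
           batch_size=8, lr0=0.1, seed=0):
    """FedAvg-style loop: local SGD on a sampled subset, then averaging."""
    rng = np.random.default_rng(seed)
    dim = devices[0][0].shape[1]
    sizes = np.array([len(y) for _, y in devices], dtype=float)
    p = sizes / sizes.sum()                               # device weights p_k = n_k / n
    w = np.zeros(dim)
    step = 0                                              # total local steps taken so far
    for _ in range(rounds):
        # Assumption: sample with replacement, probability proportional to data size.
        chosen = rng.choice(len(devices), size=num_sampled, replace=True, p=p)
        local_models = []
        for k in chosen:
            X, y = devices[k]
            w_k = w.copy()                                # start from the global model
            for e in range(local_steps):
                lr = lr0 / (1.0 + 0.01 * (step + e))      # decaying rate, beta / (t + gamma)
                idx = rng.integers(0, len(y), size=batch_size)
                grad = X[idx].T @ (X[idx] @ w_k - y[idx]) / batch_size
                w_k -= lr * grad
            local_models.append(w_k)
        step += local_steps
        w = np.mean(local_models, axis=0)                 # communicate once per round
    return w

if __name__ == "__main__":
    devices = make_devices()
    w = fedavg(devices)
    print("initial loss:", global_loss(np.zeros(3), devices))
    print("final loss:  ", global_loss(w, devices))
```

In this sketch, `local_steps` controls how often devices communicate (more local steps means fewer averaging rounds for the same total number of SGD steps), and `num_sampled` controls the participation rate. Varying these two parameters, or replacing the decaying `lr` with a constant, gives a quick way to experiment with the trade-offs the abstract describes.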