Daniel J. Beutel

Gradient-less Federated Gradient Boosting Trees with Learnable Learning Rates

May 02, 2023
Chenyang Ma, Xinchi Qiu, Daniel J. Beutel, Nicholas D. Lane

The privacy-sensitive nature of decentralized datasets and the robustness of eXtreme Gradient Boosting (XGBoost) on tabular data raise the need to train XGBoost in the context of federated learning (FL). Existing works on federated XGBoost in the horizontal setting rely on the sharing of gradients, which induces per-node level communication frequency and serious privacy concerns. To alleviate these problems, we develop an innovative framework for horizontal federated XGBoost which does not depend on the sharing of gradients and simultaneously boosts privacy and communication efficiency by making the learning rates of the aggregated tree ensembles learnable. We conduct extensive evaluations on various classification and regression datasets, showing that our approach achieves performance comparable to the state-of-the-art method and effectively improves communication efficiency by lowering both communication rounds and communication overhead by factors ranging from 25x to 700x.
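
As a rough illustration of what "learnable learning rates" over an aggregated tree ensemble could mean, the sketch below fits one rate per tree instead of using a single fixed rate. It is a minimal stand-in under stated assumptions, not the paper's method: the linear model, the squared-error objective, the function name `fit_tree_learning_rates`, and the synthetic data are all hypothetical. The gradient used here belongs only to this toy rate-fitting step and is unrelated to the XGBoost gradients that the abstract says are not shared.

```python
import numpy as np


def fit_tree_learning_rates(tree_outputs, targets, init_rate=0.1,
                            step=0.05, epochs=500):
    """Learn one learning rate per tree by gradient descent on squared error.

    tree_outputs: (n_samples, n_trees) array; column j holds tree j's raw output.
    targets:      (n_samples,) array of regression targets.
    """
    n_samples, n_trees = tree_outputs.shape
    rates = np.full(n_trees, init_rate)  # start from a fixed, shared rate
    for _ in range(epochs):
        residual = tree_outputs @ rates - targets
        grad = 2.0 * tree_outputs.T @ residual / n_samples
        rates -= step * grad
    return rates


# Synthetic stand-in for the outputs of trees collected from clients
# (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
tree_outputs = rng.normal(size=(200, 4))
true_rates = np.array([0.3, 0.05, 0.2, 0.1])
targets = tree_outputs @ true_rates

learned = fit_tree_learning_rates(tree_outputs, targets)
print(np.round(learned, 3))
```

In this toy setting the fitted rates recover the per-tree weights that generated the targets, which is the intuition behind letting the ensemble's rates adapt rather than fixing them in advance.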

* Accepted at the 3rd ACM Workshop on Machine Learning and Systems (EuroMLSys), May 8th 2023, Rome, Italy 

Secure Aggregation for Federated Learning in Flower

May 12, 2022
Kwing Hei Li, Pedro Porto Buarque de Gusmão, Daniel J. Beutel, Nicholas D. Lane

Federated Learning (FL) allows parties to learn a shared prediction model by delegating the training computation to clients and aggregating all the separately trained models on the server. To prevent private information from being inferred from local models, Secure Aggregation (SA) protocols are used to ensure that the server is unable to inspect individual trained models as it aggregates them. However, current implementations of SA in FL frameworks have limitations, including vulnerability to client dropouts or configuration difficulties. In this paper, we present Salvia, an implementation of SA for Python users in the Flower FL framework. Based on the SecAgg(+) protocols for a semi-honest threat model, Salvia is robust against client dropouts and exposes a flexible and easy-to-use API that is compatible with various machine learning frameworks. We show that Salvia's experimental performance is consistent with SecAgg(+)'s theoretical computation and communication complexities.
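
The core idea behind masking-based secure aggregation can be sketched in a few lines: each pair of clients shares a random mask that one adds and the other subtracts, so the masks cancel in the sum and the server only learns the aggregate. This is a simplified illustration, not Salvia's API or the full SecAgg(+) protocol; it omits key agreement, secret sharing for dropout recovery, and quantization of model weights. The modulus, the function names, and the seeded generator standing in for pairwise secrets are all assumptions.

```python
import random

MODULUS = 2**32  # assumed size of the integer ring used for masking


def pairwise_masks(client_ids, vector_len, seed=0):
    """Return one mask vector per client; all masks sum to zero mod MODULUS."""
    rng = random.Random(seed)  # stand-in for secrets agreed between client pairs
    masks = {cid: [0] * vector_len for cid in client_ids}
    for i, a in enumerate(client_ids):
        for b in client_ids[i + 1:]:
            shared = [rng.randrange(MODULUS) for _ in range(vector_len)]
            for k in range(vector_len):
                masks[a][k] = (masks[a][k] + shared[k]) % MODULUS  # a adds
                masks[b][k] = (masks[b][k] - shared[k]) % MODULUS  # b subtracts
    return masks


def mask_update(update, mask):
    """Hide a client's integer update behind its mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates):
    """Server-side sum; the pairwise masks cancel out."""
    total = [0] * len(masked_updates[0])
    for vec in masked_updates:
        total = [(t + v) % MODULUS for t, v in zip(total, vec)]
    return total


updates = {"a": [1, 2, 3], "b": [10, 20, 30], "c": [100, 200, 300]}
masks = pairwise_masks(list(updates), vector_len=3)
masked = [mask_update(updates[c], masks[c]) for c in updates]
result = aggregate(masked)
print(result)  # [111, 222, 333]
```

If a client dropped out after the masks were set up, its unmatched masks would no longer cancel, which is why real protocols add a recovery mechanism on top of this basic scheme.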

* Accepted to appear in the 2nd International Workshop on Distributed Machine Learning 

MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation

Oct 08, 2021
Alexandros Karargyris, Renato Umeton, Micah J. Sheller, Alejandro Aristizabal, Johnu George, Srini Bala, Daniel J. Beutel, Victor Bittorf, Akshay Chaudhari, Alexander Chowdhury, Cody Coleman, Bala Desinghu, Gregory Diamos, Debo Dutta, Diane Feddema, Grigori Fursin, Junyi Guo, Xinyuan Huang, David Kanter, Satyananda Kashyap, Nicholas Lane, Indranil Mallick, Pietro Mascagni, Virendra Mehta, Vivek Natarajan, Nikola Nikolov, Nicolas Padoy, Gennady Pekhimenko, Vijay Janapa Reddi, G Anthony Reina, Pablo Ribalta, Jacob Rosenthal, Abhishek Singh, Jayaraman J. Thiagarajan, Anna Wuest, Maria Xenochristou, Daguang Xu, Poonam Yadav, Michael Rosenthal, Massimo Loda, Jason M. Johnson, Peter Mattson

Medical AI has tremendous potential to advance healthcare by supporting the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving provider and patient experience. We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data. To meet this need, we are building MedPerf, an open framework for benchmarking machine learning in the medical domain. MedPerf will enable federated evaluation in which models are securely distributed to different facilities for evaluation, thereby empowering healthcare organizations to assess and verify the performance of AI models in an efficient and human-supervised process, while prioritizing privacy. We describe the current challenges healthcare and AI communities face, the need for an open platform, the design philosophy of MedPerf, its current implementation status, and our roadmap. We call for researchers and organizations to join us in creating the MedPerf open benchmarking platform.

End-to-End Speech Recognition from Federated Acoustic Models

Apr 29, 2021
Yan Gao, Titouan Parcollet, Javier Fernandez-Marques, Pedro P. B. de Gusmao, Daniel J. Beutel, Nicholas D. Lane

Training Automatic Speech Recognition (ASR) models under federated learning (FL) settings has recently attracted considerable attention. However, the FL scenarios often presented in the literature are artificial and fail to capture the complexity of real FL systems. In this paper, we construct a challenging and realistic ASR federated experimental setup consisting of clients with heterogeneous data distributions using the French Common Voice dataset, a large heterogeneous dataset containing over 10k speakers. We present the first empirical study of an attention-based sequence-to-sequence end-to-end (E2E) ASR model with three aggregation weighting strategies: standard FedAvg, loss-based aggregation, and a novel word error rate (WER)-based aggregation. The experiments are conducted in two realistic FL scenarios: cross-silo with 10 clients and cross-device with 2k clients. In particular, the WER-based weighting method is proposed to better adapt FL to the context of ASR by integrating the error rate metric with the aggregation process. Our analysis of E2E ASR from heterogeneous and realistic federated acoustic models provides the foundations for future research and development of realistic FL-based ASR applications.
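
The three weighting strategies named above can be contrasted with a small sketch. The abstract does not give the exact formulas, so the scoring rules here are assumptions chosen for illustration: FedAvg weights by sample count, the loss-based variant uses `exp(-loss)`, and the WER-based variant uses `1 - WER`. The function names are hypothetical.

```python
import numpy as np


def aggregation_weights(num_samples, losses, wers, strategy):
    """Return normalized per-client weights for one aggregation round."""
    if strategy == "fedavg":
        scores = np.asarray(num_samples, dtype=float)  # weight by data size
    elif strategy == "loss":
        # Assumed rule: lower loss gives a higher weight.
        scores = np.exp(-np.asarray(losses, dtype=float))
    elif strategy == "wer":
        # Assumed rule: lower WER gives a higher weight; WER can exceed 1,
        # so negative scores are clipped to zero.
        scores = np.clip(1.0 - np.asarray(wers, dtype=float), 0.0, None)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    total = scores.sum()
    if total == 0.0:
        # Fall back to uniform weights if every score is zero.
        return np.full(len(scores), 1.0 / len(scores))
    return scores / total


def aggregate(client_params, weights):
    """Weighted average of per-client parameter arrays."""
    return sum(w * p for w, p in zip(weights, client_params))


num_samples = [100, 300, 600]
losses = [1.2, 0.9, 2.0]
wers = [0.2, 0.5, 0.8]
client_params = [np.array([1.0, 1.0]), np.array([2.0, 2.0]), np.array([4.0, 4.0])]

fedavg_w = aggregation_weights(num_samples, losses, wers, "fedavg")
wer_w = aggregation_weights(num_samples, losses, wers, "wer")
global_params = aggregate(client_params, wer_w)
```

With these inputs, FedAvg favors the client with the most data, while the WER-based rule favors the client with the lowest error rate, regardless of how much data it holds.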

On-device Federated Learning with Flower

Apr 07, 2021
Akhil Mathur, Daniel J. Beutel, Pedro Porto Buarque de Gusmão, Javier Fernandez-Marques, Taner Topal, Xinchi Qiu, Titouan Parcollet, Yan Gao, Nicholas D. Lane

Federated Learning (FL) allows edge devices to collaboratively learn a shared prediction model while keeping their training data on the device, thereby decoupling the ability to do machine learning from the need to store data in the cloud. Despite the algorithmic advancements in FL, the support for on-device training of FL algorithms on edge devices remains poor. In this paper, we present an exploration of on-device FL on various smartphones and embedded devices using the Flower framework. We also evaluate the system costs of on-device FL and discuss how this quantification could be used to design more efficient FL algorithms.

* On-device Intelligence Workshop at the Fourth Conference on Machine Learning and Systems (MLSys), April 9, 2021  
* Accepted at the 2nd On-device Intelligence Workshop @ MLSys 2021. arXiv admin note: substantial text overlap with arXiv:2007.14390 

A first look into the carbon footprint of federated learning

Feb 15, 2021
Xinchi Qiu, Titouan Parcollet, Javier Fernandez-Marques, Pedro Porto Buarque de Gusmao, Daniel J. Beutel, Taner Topal, Akhil Mathur, Nicholas D. Lane

Despite impressive results, deep learning-based technologies also raise severe privacy and environmental concerns induced by the training procedure often conducted in datacenters. In response, alternatives to centralized training such as Federated Learning (FL) have emerged. Perhaps unexpectedly, FL, in particular, is starting to be deployed at a global scale by companies that must adhere to new legal demands and policies originating from governments and civil society for privacy protection. However, the potential environmental impact related to FL remains unclear and unexplored. This paper offers the first-ever systematic study of the carbon footprint of FL. First, we propose a rigorous model to quantify the carbon footprint, hence facilitating the investigation of the relationship between FL design and carbon emissions. Then, we compare the carbon footprint of FL to traditional centralized learning. Our findings show that FL, despite being slower to converge in some cases, may result in a comparatively greener impact than a centralized equivalent setup. We performed extensive experiments across different types of datasets, settings, and various deep learning models with FL. Finally, we highlight and connect the reported results to the future challenges and trends in FL to reduce its environmental impact, including algorithm efficiency, hardware capabilities, and stronger industry transparency.

* arXiv admin note: substantial text overlap with arXiv:2010.06537 

Flower: A Friendly Federated Learning Research Framework

Jul 28, 2020
Daniel J. Beutel, Taner Topal, Akhil Mathur, Xinchi Qiu, Titouan Parcollet, Nicholas D. Lane

Federated Learning (FL) has emerged as a promising technique for edge devices to collaboratively learn a shared prediction model, while keeping their training data on the device, thereby decoupling the ability to do machine learning from the need to store the data in the cloud. However, FL is difficult to implement and deploy in practice, considering the heterogeneity in mobile devices, e.g., different programming languages, frameworks, and hardware accelerators. Although there are a few frameworks available to simulate FL algorithms (e.g., TensorFlow Federated), they do not support implementing FL workloads on mobile devices. Furthermore, these frameworks are designed to simulate FL in a server environment and hence do not allow experimentation in distributed mobile settings for a large number of clients. In this paper, we present Flower (https://flower.dev/), an FL framework which is both agnostic towards heterogeneous client environments and also scales to a large number of clients, including mobile and embedded devices. Flower's abstractions let developers port existing mobile workloads with little overhead, regardless of the programming language or ML framework used, while also allowing researchers flexibility to experiment with novel approaches to advance the state of the art. We describe the design goals and implementation considerations of Flower and show our experiences in evaluating the performance of FL across clients with heterogeneous computational and communication capabilities.

* Open-Source, mobile-friendly Federated Learning framework 