Liuyi Yao

Efficient Personalized Federated Learning via Sparse Model-Adaptation

May 04, 2023
Daoyuan Chen, Liuyi Yao, Dawei Gao, Bolin Ding, Yaliang Li

Federated Learning (FL) aims to train machine learning models for multiple clients without sharing their private data. Due to the heterogeneity of clients' local data distributions, recent studies explore personalized FL, which learns and deploys distinct local models with the help of auxiliary global models. However, clients can be heterogeneous not only in their local data distributions but also in their computation and communication resources. The capacity and efficiency of personalized models are then restricted by the lowest-resource clients, leading to sub-optimal performance and limited practicality of personalized FL. To overcome these challenges, we propose a novel approach named pFedGate for efficient personalized FL, which adaptively and efficiently learns sparse local models. With a lightweight trainable gating layer, pFedGate enables clients to reach their full potential in model capacity by generating different sparse models that account for both the heterogeneous data distributions and the resource constraints. Meanwhile, computation and communication efficiency are both improved thanks to the adaptability between model sparsity and clients' resources. Further, we theoretically show that pFedGate has superior complexity with guaranteed convergence and generalization error. Extensive experiments show that pFedGate simultaneously achieves superior global accuracy, individual accuracy, and efficiency over state-of-the-art methods. We also demonstrate that pFedGate outperforms competitors in the novel-client and partial-client participation scenarios, and can learn meaningful sparse local models adapted to different data distributions.

* Accepted to ICML 2023 
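
To make the sparse-model-adaptation idea concrete, below is a minimal, hypothetical sketch of a block-wise gating layer that keeps only the parameter blocks a client can afford under its sparsity budget. It illustrates the general mechanism only; the class and function names are ours, not the authors' pFedGate implementation.

```python
# Hypothetical sketch of a trainable gating layer for block-wise sparse
# model-adaptation (not the authors' pFedGate code). Assumes a per-client
# sparsity budget in (0, 1] and a shared model split into parameter blocks.
import torch
import torch.nn as nn

class SparseGate(nn.Module):
    def __init__(self, num_blocks: int):
        super().__init__()
        # One trainable logit per parameter block of the shared model.
        self.logits = nn.Parameter(torch.zeros(num_blocks))

    def forward(self, sparsity_budget: float) -> torch.Tensor:
        # Relaxed gates in (0, 1); keep only the top-k blocks the client can
        # afford under its resource budget.
        gates = torch.sigmoid(self.logits)
        k = max(1, int(sparsity_budget * gates.numel()))
        top_idx = torch.topk(gates, k).indices
        mask = torch.zeros_like(gates)
        mask[top_idx] = 1.0
        # Straight-through estimator: hard mask forward, soft gradient backward.
        return mask + gates - gates.detach()

def apply_gates(blocks, gates):
    # Scale each parameter block by its gate; zeroed blocks can be skipped
    # in both computation and communication.
    return [g * w for g, w in zip(gates, blocks)]
```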

Revisiting Personalized Federated Learning: Robustness Against Backdoor Attacks

Feb 03, 2023
Zeyu Qin, Liuyi Yao, Daoyuan Chen, Yaliang Li, Bolin Ding, Minhao Cheng

In this work, besides improving prediction accuracy, we study whether personalization could bring robustness benefits against backdoor attacks. We conduct the first study of backdoor attacks in the pFL framework, testing 4 widely used backdoor attacks against 6 pFL methods on the benchmark datasets FEMNIST and CIFAR-10, for a total of 600 experiments. The study shows that pFL methods with partial model-sharing can significantly boost robustness against backdoor attacks, whereas pFL methods with full model-sharing do not show such robustness. To analyze the reasons for the varying robustness performance, we provide comprehensive ablation studies on the different pFL methods. Based on our findings, we further propose a lightweight defense method, Simple-Tuning, which empirically improves defense performance against backdoor attacks. We believe that our work can both provide guidance for pFL applications in terms of robustness and offer valuable insights for designing more robust FL methods in the future.
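
As an illustration of what such a lightweight defense could look like, the sketch below assumes Simple-Tuning amounts to re-initializing the classification head and fine-tuning only that head on a client's local data while freezing the shared feature extractor; the exact layers tuned, schedule, and hyperparameters are assumptions, not the paper's specification.

```python
# Hedged sketch of a lightweight post-FL defense in the spirit of
# Simple-Tuning: re-initialize the linear classification head and fine-tune
# only that head on the client's local data, keeping the rest of the model
# frozen. The head name, optimizer, and epoch count are illustrative choices.
import torch
import torch.nn as nn

def simple_tuning(model: nn.Module, head_name: str, local_loader, epochs: int = 5):
    head = getattr(model, head_name)      # e.g. model.fc for a ResNet-style model
    head.reset_parameters()               # drop possibly poisoned head weights
    for name, p in model.named_parameters():
        p.requires_grad = name.startswith(head_name)   # freeze the backbone
    opt = torch.optim.SGD(head.parameters(), lr=0.01, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in local_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model
```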

A Benchmark for Federated Hetero-Task Learning

Jun 21, 2022
Liuyi Yao, Dawei Gao, Zhen Wang, Yuexiang Xie, Weirui Kuang, Daoyuan Chen, Haohui Wang, Chenhe Dong, Bolin Ding, Yaliang Li

To investigate the heterogeneity of federated learning in real-world scenarios, we generalize classic federated learning to federated hetero-task learning, which emphasizes the inconsistency across participants in federated learning in terms of both data distribution and learning tasks. We also present B-FHTL, a federated hetero-task learning benchmark consisting of simulation datasets, FL protocols, and a unified evaluation mechanism. The B-FHTL dataset contains three well-designed federated learning tasks with increasing heterogeneity; each task simulates clients with different non-IID data and learning tasks. To ensure fair comparison among different FL algorithms, B-FHTL builds in a full suite of FL protocols by providing high-level APIs to avoid privacy leakage, and presets the most common evaluation metrics spanning different learning tasks, such as regression, classification, and text generation. Furthermore, we compare FL algorithms from the fields of federated multi-task learning, federated personalization, and federated meta-learning within B-FHTL, and highlight the influence of heterogeneity and the difficulties of federated hetero-task learning. Our benchmark, including the federated dataset, protocols, evaluation mechanism, and preliminary experiments, is open-sourced at https://github.com/alibaba/FederatedScope/tree/master/benchmark/B-FHTL
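
Purely for illustration, the following sketch shows how a hetero-task benchmark entry might be described, with each client carrying its own task type and metric; the names and API are hypothetical and do not reflect B-FHTL's actual interface.

```python
# Hypothetical description of a hetero-task benchmark: each client has its own
# data split, task type, and metric, so algorithms must cope with heterogeneity
# in both data and objectives. Names are illustrative, not B-FHTL's real API.
from dataclasses import dataclass

@dataclass
class ClientTask:
    client_id: int
    task_type: str      # e.g. "classification", "regression", "text-generation"
    metric: str         # e.g. "accuracy", "rmse", "bleu"
    data_path: str

benchmark = [
    ClientTask(0, "classification", "accuracy", "data/client_0"),
    ClientTask(1, "regression", "rmse", "data/client_1"),
    ClientTask(2, "text-generation", "bleu", "data/client_2"),
]

def summarize(results: dict) -> dict:
    # Report each client's metric separately; with differing tasks there is
    # no single global score to aggregate into.
    return {t.client_id: results.get(t.client_id) for t in benchmark}
```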

Federated Hetero-Task Learning

Jun 07, 2022
Liuyi Yao, Dawei Gao, Zhen Wang, Yuexiang Xie, Weirui Kuang, Daoyuan Chen, Haohui Wang, Chenhe Dong, Bolin Ding, Yaliang Li

To investigate the heterogeneity of federated learning in real-world scenarios, we generalize classical federated learning to federated hetero-task learning, which emphasizes the inconsistency across participants in federated learning in terms of both data distribution and learning tasks. We also present B-FHTL, a federated hetero-task learning benchmark consisting of simulation datasets, FL protocols, and a unified evaluation mechanism. The B-FHTL dataset contains three well-designed federated learning tasks with increasing heterogeneity; each task simulates clients with different data distributions and learning tasks. To ensure fair comparison among different FL algorithms, B-FHTL builds in a full suite of FL protocols by providing high-level APIs to avoid privacy leakage, and presets the most common evaluation metrics spanning different learning tasks, such as regression, classification, and text generation. Furthermore, we compare FL algorithms from the fields of federated multi-task learning, federated personalization, and federated meta-learning within B-FHTL, and highlight the influence of heterogeneity and the difficulties of federated hetero-task learning. Our benchmark, including the federated dataset, protocols, evaluation mechanism, and preliminary experiments, is open-sourced at https://github.com/alibaba/FederatedScope/tree/contest/v1.0.

FederatedScope-GNN: Towards a Unified, Comprehensive and Efficient Package for Federated Graph Learning

Apr 14, 2022
Zhen Wang, Weirui Kuang, Yuexiang Xie, Liuyi Yao, Yaliang Li, Bolin Ding, Jingren Zhou

The rapid development of federated learning (FL) has benefited various tasks in the domains of computer vision and natural language processing, and existing frameworks such as TFF and FATE have made deployment easy in real-world applications. However, federated graph learning (FGL), even though graph data are prevalent, has not been well supported due to its unique characteristics and requirements. The lack of an FGL-related framework increases the effort required for reproducible research and for deployment in real-world applications. Motivated by such strong demand, in this paper we first discuss the challenges in creating an easy-to-use FGL package and accordingly present our implemented package FederatedScope-GNN (FS-G), which provides (1) a unified view for modularizing and expressing FGL algorithms; (2) a comprehensive DataZoo and ModelZoo for out-of-the-box FGL capability; (3) an efficient model auto-tuning component; and (4) off-the-shelf privacy attack and defense abilities. We validate the effectiveness of FS-G by conducting extensive experiments, which also yield many valuable insights about FGL for the community. Moreover, we employ FS-G to serve FGL applications in real-world E-commerce scenarios, where the attained improvements indicate great potential business benefits. We publicly release FS-G, as submodules of FederatedScope, at https://github.com/alibaba/FederatedScope to promote FGL research and to enable broad applications that would otherwise be infeasible due to the lack of a dedicated package.

* We have released FederatedScope for users on https://github.com/alibaba/FederatedScope 
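
To give a flavor of the federated graph learning setting that FS-G targets, here is a self-contained sketch, not FS-G's API, in which each client trains a one-layer GCN-style model on its private subgraph and the server averages the resulting weights, FedAvg-style; the model, optimizer, and aggregation choices are illustrative assumptions.

```python
# Illustrative federated graph learning loop (not FS-G's actual API): clients
# hold private subgraphs, run local updates, and the server averages weights.
import torch
import torch.nn as nn

class TinyGCN(nn.Module):
    def __init__(self, in_dim: int, num_classes: int):
        super().__init__()
        self.lin = nn.Linear(in_dim, num_classes)

    def forward(self, x: torch.Tensor, adj_norm: torch.Tensor) -> torch.Tensor:
        # One propagation step: aggregate neighbor features, then transform.
        return self.lin(adj_norm @ x)

def local_update(model, x, adj_norm, y, epochs=1, lr=0.01):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.cross_entropy(model(x, adj_norm), y).backward()
        opt.step()
    return {k: v.detach().clone() for k, v in model.state_dict().items()}

def fed_avg(client_states):
    # Uniform average of client weights; weighting by sample count is common.
    return {k: torch.stack([s[k] for s in client_states]).mean(0)
            for k in client_states[0]}
```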

FederatedScope: A Comprehensive and Flexible Federated Learning Platform via Message Passing

Apr 11, 2022
Yuexiang Xie, Zhen Wang, Daoyuan Chen, Dawei Gao, Liuyi Yao, Weirui Kuang, Yaliang Li, Bolin Ding, Jingren Zhou

Although the existing federated learning (FL) platforms have made remarkable progress in providing fundamental functionalities for development, they cannot fully satisfy the burgeoning demands of rapidly growing FL tasks in both academia and industry. To fill this gap, in this paper we propose a novel and comprehensive federated learning platform, named FederatedScope, which is based on a message-oriented framework. Toward more handy and flexible support for various FL tasks, FederatedScope frames an FL course as several rounds of message passing among participants, and allows developers to customize new types of exchanged messages and the corresponding handlers for various FL applications. Compared to a procedural framework, the proposed message-oriented framework is more flexible in expressing heterogeneous message exchanges and the rich behaviors of participants, and provides a unified view for both simulation and deployment. Besides, we include several functional components in FederatedScope, such as personalization, auto-tuning, and privacy protection, to satisfy the requirements of frontier studies in FL. We conduct a series of experiments on the provided easy-to-use and comprehensive FL benchmarks to validate the correctness and efficiency of FederatedScope. We have released FederatedScope for users at https://github.com/alibaba/FederatedScope to promote research on and industrial deployment of federated learning in a variety of real-world applications.

* We have released FederatedScope for users on https://github.com/alibaba/FederatedScope 
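
The message-oriented idea can be sketched as follows: participants exchange typed messages, and developers register a handler per message type. The class and method names below are hypothetical illustrations of the pattern, not FederatedScope's actual API.

```python
# Hypothetical sketch of the message-passing view of an FL course: typed
# messages flow between participants, and behaviors are added by registering
# handlers for new message types.
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class Message:
    msg_type: str   # e.g. "model_para", "evaluate", or a custom type
    sender: str
    receiver: str
    content: Any

class Participant:
    def __init__(self, name: str):
        self.name = name
        self._handlers: Dict[str, Callable[[Message], List[Message]]] = {}

    def register_handler(self, msg_type: str,
                         fn: Callable[[Message], List[Message]]) -> None:
        # New message types and behaviors are added by registering handlers.
        self._handlers[msg_type] = fn

    def handle(self, msg: Message) -> List[Message]:
        fn = self._handlers.get(msg.msg_type)
        return fn(msg) if fn else []

# Usage: a client that replies to a broadcast of global model parameters.
client = Participant("client_1")
client.register_handler(
    "model_para",
    lambda m: [Message("model_update", "client_1", m.sender, m.content)],
)
replies = client.handle(Message("model_para", "server", "client_1", {"w": 0.0}))
```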

A Survey on Causal Inference

Feb 05, 2020
Liuyi Yao, Zhixuan Chu, Sheng Li, Yaliang Li, Jing Gao, Aidong Zhang

Causal inference has been a critical research topic across many domains, such as statistics, computer science, education, public policy, and economics, for decades. Nowadays, estimating causal effects from observational data has become an appealing research direction owing to the large amount of available data and the low budget required, compared with randomized controlled trials. Embracing the rapidly developing machine learning area, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine-learning-enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including applications in advertising, recommendation, medicine, and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are summarized, which can help researchers and practitioners explore, evaluate, and apply the causal inference methods.
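
As a small worked example of one method family reviewed in the survey, the snippet below estimates the average treatment effect from synthetic observational data via inverse propensity weighting, assuming the potential outcome framework's assumptions (unconfoundedness, positivity, SUTVA) hold; the data-generating process and library choices are ours, for illustration only.

```python
# Inverse propensity weighting (IPW) on synthetic observational data: the
# confounder x drives both treatment assignment and outcome, so the naive
# difference in means is biased, while reweighting by estimated propensity
# scores recovers the true average treatment effect (ATE = 2.0 here).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 1))                        # confounder
p = 1 / (1 + np.exp(-1.5 * x[:, 0]))               # treatment depends on x
t = rng.binomial(1, p)                             # observed treatment
y = 2.0 * t + 1.0 * x[:, 0] + rng.normal(size=n)   # outcome, true ATE = 2.0

e = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]   # propensity scores
ate_ipw = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))
ate_naive = y[t == 1].mean() - y[t == 0].mean()    # biased by confounding
print(f"naive difference: {ate_naive:.2f}, IPW estimate: {ate_ipw:.2f}")
```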

Finding Similar Medical Questions from Question Answering Websites

Oct 14, 2018
Yaliang Li, Liuyi Yao, Nan Du, Jing Gao, Qi Li, Chuishi Meng, Chenwei Zhang, Wei Fan

The past few years have witnessed the flourishing of crowdsourced medical question answering (Q&A) websites. Patients with medical information needs tend to post questions about their health conditions on these crowdsourced Q&A websites and get answers from other users. However, we observe that a large portion of new medical questions cannot be answered in time or receive only a few answers on these websites. On the other hand, we notice that already-solved questions have great potential to address this challenge. Motivated by these observations, we propose an end-to-end system that automatically finds similar questions for unsolved medical questions. By learning vector representations of unsolved questions and their candidate similar questions, the proposed system outputs similar questions according to the similarity between the vector representations. Through the vector representations, similar questions are found at the question level, and the issue of diverse expressions of medical questions can be addressed. Further, we handle two more important issues, i.e., the training data generation issue and the efficiency issue, associated with the LSTM training procedure and the retrieval of candidate similar questions. The effectiveness of the proposed system is validated on a large-scale real-world dataset collected from a crowdsourced maternal-infant Q&A website.
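
For the retrieval step described above, a hedged sketch: once questions are embedded as vectors (the paper uses LSTM-based representations; any encoder would do for illustration), candidate solved questions can be ranked by cosine similarity. The function name and the embedding dimension below are assumptions, not the paper's specification.

```python
# Rank already-solved questions by cosine similarity to an unsolved question's
# embedding; a single matrix-vector product scores all candidates at once.
import numpy as np

def cosine_top_k(query_vec: np.ndarray, solved_vecs: np.ndarray, k: int = 5):
    q = query_vec / (np.linalg.norm(query_vec) + 1e-12)
    m = solved_vecs / (np.linalg.norm(solved_vecs, axis=1, keepdims=True) + 1e-12)
    scores = m @ q
    top = np.argsort(-scores)[:k]
    return top, scores[top]

# Usage with random placeholder embeddings (dimension 128 is an assumption):
solved = np.random.randn(10000, 128)
query = np.random.randn(128)
idx, sims = cosine_top_k(query, solved, k=3)
```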
