Min Liu

Federated Skewed Label Learning with Logits Fusion

Nov 14, 2023
Yuwei Wang, Runhan Li, Hao Tan, Xuefeng Jiang, Sheng Sun, Min Liu, Bo Gao, Zhiyuan Wu

Federated learning (FL) aims to collaboratively train a shared model across multiple clients without transmitting their local data. Data heterogeneity is a critical challenge in realistic FL settings, as discrepancies in local optimization cause significant performance deterioration. In this work, we focus on label distribution skew, a common form of data heterogeneity in which the label categories are imbalanced on each client. To address this issue, we propose FedBalance, which corrects the optimization bias among local models by calibrating their logits. Specifically, we introduce an extra private weak learner on the client side, which forms an ensemble with the local model. By fusing the logits of the two models, the private weak learner can capture the variance of different data regardless of category. The optimization direction of local models is thus improved by increasing the penalty for misclassifying minority classes and reducing the attention paid to majority classes, resulting in a better global model. Extensive experiments show that our method achieves 13% higher average accuracy than state-of-the-art methods.
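
To make the logit-fusion idea concrete, here is a minimal PyTorch sketch, assuming the calibrated prediction is simply the sum of the local model's and the private weak learner's logits; the function names and the fusion rule are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of logit fusion between a client's local model and a
# private weak learner. The additive fusion rule is an assumption.
import torch
import torch.nn.functional as F

def fused_loss(local_logits: torch.Tensor,
               weak_logits: torch.Tensor,
               labels: torch.Tensor) -> torch.Tensor:
    """Cross-entropy on the sum of the two models' logits.

    A weak learner that already fits the majority classes pushes their
    fused logits up, shrinking their gradient, while minority classes
    keep a large loss and therefore receive a larger correction.
    """
    fused = local_logits + weak_logits.detach()  # calibrate only; the weak learner is not updated here
    return F.cross_entropy(fused, labels)

# Toy usage with random tensors.
local = torch.randn(8, 10, requires_grad=True)
weak = torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
fused_loss(local, weak, y).backward()
```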

* 9 pages, 4 figures, 4 tables 

Beating Backdoor Attack at Its Own Game

Aug 04, 2023
Min Liu, Alberto Sangiovanni-Vincentelli, Xiangyu Yue

Deep neural networks (DNNs) are vulnerable to backdoor attacks, which do not affect the network's performance on clean data but manipulate its behavior once a trigger pattern is added. Existing defense methods have greatly reduced attack success rates, but their prediction accuracy on clean data still lags behind that of a clean model by a large margin. Inspired by the stealthiness and effectiveness of backdoor attacks, we propose a simple but highly effective defense framework that injects non-adversarial backdoors targeting poisoned samples. Following the general steps of a backdoor attack, we detect a small set of suspected samples and then apply a poisoning strategy to them. The non-adversarial backdoor, once triggered, suppresses the attacker's backdoor on poisoned data but has limited influence on clean data. The defense can be carried out during data preprocessing, without any modification to the standard end-to-end training pipeline. We conduct extensive experiments on multiple benchmarks with different architectures and representative attacks. Results demonstrate that our method achieves state-of-the-art defense effectiveness with by far the lowest performance drop on clean data. Given the surprising defense ability displayed by our framework, we call for more attention to utilizing backdoors for backdoor defense. Code is available at https://github.com/damianliumin/non-adversarial_backdoor.
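
A minimal sketch of what the preprocessing step might look like, assuming suspected samples are already flagged and the defensive backdoor is a fixed corner patch; both the detection heuristic and the trigger design are illustrative stand-ins, not the paper's exact recipe.

```python
# Hedged sketch of a data-preprocessing defense: stamp a small defensive
# trigger onto samples flagged as suspicious, so the network learns a
# non-adversarial backdoor tied to them.
import numpy as np

def stamp_defensive_trigger(images: np.ndarray, suspected: np.ndarray,
                            patch_value: float = 1.0, size: int = 3) -> np.ndarray:
    """images: (N, H, W, C) floats in [0, 1]; suspected: boolean mask of shape (N,)."""
    out = images.copy()
    out[suspected, :size, :size, :] = patch_value  # fixed corner patch as the trigger
    return out

# Toy usage: flag half of a random batch as suspected and stamp it.
imgs = np.random.rand(4, 32, 32, 3)
mask = np.array([True, False, True, False])
defended = stamp_defensive_trigger(imgs, mask)
```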

* Accepted to ICCV 2023 

FedBIAD: Communication-Efficient and Accuracy-Guaranteed Federated Learning with Bayesian Inference-Based Adaptive Dropout

Jul 14, 2023
Jingjing Xue, Min Liu, Sheng Sun, Yuwei Wang, Hui Jiang, Xuefeng Jiang

Federated Learning (FL) has emerged as a distributed machine learning paradigm that avoids privacy leakage by keeping end-user data on-device. Participating devices in FL are usually bandwidth-constrained, and the uplink is much slower than the downlink in wireless networks, causing a severe uplink communication bottleneck. A prominent direction for alleviating this problem is federated dropout, which drops a fraction of the weights of local models. However, existing federated dropout studies focus on random or ordered dropout and lack theoretical support, so their performance is not guaranteed. In this paper, we propose Federated learning with Bayesian Inference-based Adaptive Dropout (FedBIAD), which regards the weight rows of local models as probability distributions and adaptively drops partial weight rows based on importance indicators correlated with the trend of the local training loss. With FedBIAD, each client adaptively selects a high-quality dropping pattern with accurate approximations and transmits only the parameters of non-dropped weight rows, mitigating uplink costs while improving accuracy. Theoretical analysis demonstrates that the convergence rate of the average generalization error of FedBIAD is minimax optimal up to a squared logarithmic factor. Extensive experiments on image classification and next-word prediction show that, compared with status quo approaches, FedBIAD provides a 2x uplink reduction with an accuracy increase of up to 2.41% even on non-Independent and Identically Distributed (non-IID) data, which brings up to a 72% decrease in training time.
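
A hedged sketch of importance-based row dropout, using the row L2 norm as a stand-in importance indicator; the paper instead derives its indicator from Bayesian inference and the local training-loss trend.

```python
# Minimal sketch of importance-based weight-row selection for uplink
# compression. The row-norm score is an illustrative assumption.
import torch

def select_rows(weight: torch.Tensor, keep_ratio: float = 0.5):
    """Return the indices and values of the most 'important' weight rows."""
    scores = weight.norm(dim=1)                      # one score per row
    k = max(1, int(keep_ratio * weight.shape[0]))
    kept = torch.topk(scores, k).indices
    return kept, weight[kept]                        # only these rows are uploaded

W = torch.randn(128, 64)
idx, rows = select_rows(W, keep_ratio=0.5)           # roughly 2x uplink reduction at 0.5
```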

* 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS)  

Selective Knowledge Distillation for Non-Autoregressive Neural Machine Translation

Mar 31, 2023
Min Liu, Yu Bao, Chengqi Zhao, Shujian Huang

Benefiting from sequence-level knowledge distillation, the Non-Autoregressive Transformer (NAT) achieves great success in neural machine translation tasks. However, existing knowledge distillation has side effects, such as propagating errors from the teacher to NAT students, which may limit further improvements of NAT models and are rarely discussed in existing research. In this paper, we introduce selective knowledge distillation, which employs an NAT evaluator to select NAT-friendly targets that are of high quality and easy to learn. In addition, we introduce a simple yet effective progressive distillation method to boost NAT performance. Experimental results on multiple WMT language directions and several representative NAT models show that our approach can realize a flexible trade-off between the quality and complexity of training data for NAT models, achieving strong performance. Further analysis shows that distilling only 5% of the raw translations can help an NAT outperform its counterpart trained on raw data by about 2.4 BLEU.
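
An illustrative filter for the selection step, assuming distilled targets are ranked by an evaluator score and only the top fraction is kept; the length-ratio scorer below is a placeholder for the paper's NAT-based evaluator.

```python
# Sketch of selective distillation: keep only the teacher targets an
# evaluator scores as NAT-friendly. The scorer is a stand-in assumption.
from typing import Callable, List, Tuple

def select_targets(pairs: List[Tuple[str, str]],
                   score: Callable[[str, str], float],
                   keep_fraction: float = 0.05) -> List[Tuple[str, str]]:
    """pairs: (source, distilled_target); keep the top-scoring fraction."""
    ranked = sorted(pairs, key=lambda p: score(*p), reverse=True)
    k = max(1, int(keep_fraction * len(ranked)))
    return ranked[:k]

# Toy usage with a length-ratio heuristic as the placeholder scorer.
data = [("ein Haus", "a house"), ("guten Morgen allerseits", "morning")]
subset = select_targets(data, lambda s, t: len(t) / max(len(s), 1))
```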

Online Spatio-Temporal Correlation-Based Federated Learning for Traffic Flow Forecasting

Feb 17, 2023
Qingxiang Liu, Sheng Sun, Min Liu, Yuwei Wang, Bo Gao

Traffic flow forecasting (TFF) is of great importance to the construction of Intelligent Transportation Systems (ITS). To mitigate the communication burden and tackle the privacy leakage caused by centralized forecasting methods, Federated Learning (FL) has been applied to TFF. However, existing FL-based approaches train in a batch learning manner, which makes the pre-trained models inapplicable to subsequent traffic data and thus yields subpar prediction performance. In this paper, we perform the first study of forecasting traffic flow in an Online Learning (OL) manner within an FL framework and propose a novel prediction method named Online Spatio-Temporal Correlation-based Federated Learning (FedOSTC), which aims to guarantee performance gains regardless of traffic fluctuation. Specifically, clients employ Gated Recurrent Unit (GRU)-based encoders to capture the temporal patterns within traffic data sequences. The central server then evaluates spatial correlation among clients via a Graph Attention Network (GAT), catering to the dynamic changes in spatial closeness caused by traffic fluctuation. Furthermore, to improve the generalization of the global model on upcoming traffic data, a period-aware aggregation mechanism is proposed to aggregate the local models, which are optimized at the clients using the Online Gradient Descent (OGD) algorithm. We perform comprehensive experiments on two real-world datasets to validate the efficiency and effectiveness of our proposed method, and the numerical results demonstrate the superiority of FedOSTC.
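
A minimal sketch of the client-side Online Gradient Descent step, assuming a GRU encoder with a linear forecasting head; the model sizes, loss, and learning rate are illustrative choices rather than the paper's configuration.

```python
# Sketch of an online update: the local model is refreshed on each
# arriving window of traffic readings instead of a static batch.
import torch

model = torch.nn.GRU(input_size=1, hidden_size=16, batch_first=True)
head = torch.nn.Linear(16, 1)
opt = torch.optim.SGD(list(model.parameters()) + list(head.parameters()), lr=0.01)

def ogd_step(window: torch.Tensor, target: torch.Tensor) -> float:
    """One online update on a (batch, time, 1) window of flow readings."""
    out, _ = model(window)
    pred = head(out[:, -1])          # forecast from the last hidden state
    loss = torch.nn.functional.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

loss = ogd_step(torch.randn(4, 12, 1), torch.randn(4, 1))
```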

Survey of Knowledge Distillation in Federated Edge Learning

Jan 14, 2023
Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu, Xuefeng Jiang, Runhan Li

The increasing demand for intelligent services and privacy protection on mobile and Internet of Things (IoT) devices motivates the wide application of Federated Edge Learning (FEL), in which devices collaboratively train on-device Machine Learning (ML) models without sharing their private data. Limited by device hardware, diverse user behaviors, and network infrastructure, the algorithm design of FEL faces challenges related to resources, personalization, and network environments, and Knowledge Distillation (KD) has been leveraged as an important technique to tackle these challenges. In this paper, we survey the works that apply KD to FEL, discuss the limitations and open problems of existing KD-based FEL approaches, and provide guidance for their real-world deployment.

* 12 pages, 1 table 

A Personalized Utterance Style (PUS) based Dialogue Strategy for Efficient Service Requirement Elicitation

Jan 07, 2023
Demin Yu, Min Liu, Zhongjie Wang

With the flourishing of services on the Internet, a prerequisite for service providers to precisely deliver services to their customers is to capture user requirements comprehensively, accurately, and efficiently. This is called the "Service Requirement Elicitation (SRE)" task. Since the number of customers is huge, it is inefficient for service providers to interact with every user through face-to-face dialog, so eliciting user requirements with the assistance of virtual intelligent assistants has become the mainstream approach. Because user requirements generally consist of different levels of detail and must be satisfied by services from multiple domains, SRE has to explore a huge potential requirement space to elicit complete requirements. Since traditional dialogue systems with static slots cannot be directly applied to the SRE task, designing an efficient dialogue strategy that guides users to express complete and accurate requirements in such a huge space is a challenge. Based on the observation that users tend to express requirements subjectively and sequentially, we propose a Personalized Utterance Style (PUS) module to perceive each user's requirement expression habits, and then apply PUS to a dialogue strategy to efficiently complete the SRE task. Specifically, the dialogue strategy chooses suitable response actions for dynamically updating the dialogue state. With the assistance of the PUS extracted from the dialogue history, the system can shrink the search scope of the potential requirement space. Experimental results show that the dialogue strategy with PUS can elicit more accurate user requirements with fewer dialogue rounds.
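
A hypothetical sketch of how a PUS-style profile could shrink the requirement search space, assuming the profile is a per-slot preference score; the attributes and ranking rule are invented for illustration, not the paper's actual module.

```python
# Toy sketch: rank candidate requirement slots by a user's expression
# habits so the dialogue strategy asks about likely slots first.
from typing import Dict, List

def next_slots(candidates: List[str], pus: Dict[str, float], top_k: int = 3) -> List[str]:
    """Order candidate slots by the (assumed) PUS preference scores."""
    return sorted(candidates, key=lambda s: pus.get(s, 0.0), reverse=True)[:top_k]

profile = {"budget": 0.9, "location": 0.7, "brand": 0.2}   # hypothetical profile
print(next_slots(["brand", "budget", "location", "color"], profile))
```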

FedICT: Federated Multi-task Distillation for Multi-access Edge Computing

Jan 01, 2023
Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu, Xuefeng Jiang, Bo Gao

The growing interest in intelligent services and privacy protection for mobile devices has given rise to the widespread application of federated learning in Multi-access Edge Computing (MEC). Diverse user behaviors call for personalized services with heterogeneous Machine Learning (ML) models on different devices. Federated Multi-task Learning (FMTL) has been proposed to train related but personalized ML models for different devices, but previous works suffer from excessive communication overhead during training and neglect the model heterogeneity among devices in MEC. Introducing knowledge distillation into FMTL can simultaneously enable efficient communication and model heterogeneity among clients, yet existing methods rely on a public dataset, which is impractical in reality. To tackle this dilemma, Federated MultI-task Distillation for Multi-access Edge CompuTing (FedICT) is proposed. FedICT exchanges local and global knowledge through bi-directional distillation between clients and the server, aiming to support multi-task clients while alleviating the client drift caused by divergent optimization directions of client-side local models. Specifically, FedICT comprises Federated Prior Knowledge Distillation (FPKD) and Local Knowledge Adjustment (LKA). FPKD reinforces the clients' fitting of local data by introducing prior knowledge of the local data distributions, while LKA corrects the distillation loss of the server so that the transferred local knowledge better matches the generalized representation. Experiments on three datasets show that FedICT significantly outperforms all compared benchmarks across various data-heterogeneity and model-architecture settings, achieving improved accuracy with less than 1.2% of the training communication overhead of FedAvg and no more than 75% of the training communication rounds of FedGKT.
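
A hedged sketch of a client-side objective in the spirit of FPKD: cross-entropy plus a distillation term weighted by the local label prior, so distillation leans on classes where local data is dense. The weighting scheme and hyperparameters are assumptions for illustration, not FedICT's exact losses.

```python
# Illustrative client-side loss: CE on local labels plus a KD term to the
# server's soft predictions, reweighted by the (assumed) local class prior.
import torch
import torch.nn.functional as F

def client_loss(logits, global_logits, labels, class_prior, T: float = 2.0):
    ce = F.cross_entropy(logits, labels)
    kd = F.kl_div(F.log_softmax(logits / T, dim=1),
                  F.softmax(global_logits / T, dim=1),
                  reduction="none").sum(dim=1)      # per-sample KL divergence
    weights = class_prior[labels]                    # distill harder where local data is dense
    return ce + (weights * kd).mean() * T * T

logits = torch.randn(8, 5, requires_grad=True)
loss = client_loss(logits, torch.randn(8, 5),
                   torch.randint(0, 5, (8,)), torch.rand(5))
loss.backward()
```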

* 15 pages, 4 figures, 9 tables 

THUEE system description for NIST 2020 SRE CTS challenge

Oct 12, 2022
Yu Zheng, Jinghan Peng, Miao Zhao, Yufeng Ma, Min Liu, Xinyue Ma, Tianyu Liang, Tianlong Kong, Liang He, Minqiang Xu

This paper presents the system description of the THUEE team for the NIST 2020 Speaker Recognition Evaluation (SRE) conversational telephone speech (CTS) challenge. Subsystems based on ResNet74, ResNet152, and RepVGG-B2 are developed as speaker embedding extractors for this evaluation. We use a loss function that combines AM-Softmax and AAM-Softmax, namely CM-Softmax, and adopt a two-stage training strategy to further improve system performance. We fuse all individual systems into our final submission. Our approach leads to excellent performance and ranks 1st in the challenge.
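
A sketch of a combined-margin logit in the spirit of CM-Softmax, merging the AAM-Softmax angular margin with the AM-Softmax additive margin; the scale and margin values are placeholders, as the paper's exact settings are not given here.

```python
# Hedged sketch: apply both an angular margin m1 (AAM-Softmax style) and
# an additive cosine margin m2 (AM-Softmax style) to the true-class logit.
import torch

def combined_margin_logits(cosine: torch.Tensor, labels: torch.Tensor,
                           s: float = 32.0, m1: float = 0.2, m2: float = 0.1):
    """cosine: (N, C) cosine similarities between embeddings and class weights."""
    theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
    target = torch.cos(theta + m1) - m2              # margins on the true class only
    onehot = torch.nn.functional.one_hot(labels, cosine.shape[1]).bool()
    return s * torch.where(onehot, target, cosine)

logits = combined_margin_logits(torch.rand(4, 10) * 2 - 1,
                                torch.randint(0, 10, (4,)))
```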

* 3 pages, 1 table; system description for the NIST 2020 SRE CTS challenge 