Lan Zhang

Information School Capital University of Economics and Business, China

Fed-CPrompt: Contrastive Prompt for Rehearsal-Free Federated Continual Learning

Jul 10, 2023

Gaurav Bagwe, Xiaoyong Yuan, Miao Pan, Lan Zhang

Abstract: Federated continual learning (FCL) learns incremental tasks over time from confidential datasets distributed across clients. This paper focuses on rehearsal-free FCL, which suffers from severe forgetting when learning new tasks because historical task data is inaccessible. To address this issue, we propose Fed-CPrompt, which builds on prompt learning techniques to obtain task-specific prompts in a communication-efficient way. Fed-CPrompt introduces two key components, asynchronous prompt learning and contrastive continual loss, to handle asynchronous task arrival and heterogeneous data distributions in FCL, respectively. Extensive experiments demonstrate the effectiveness of Fed-CPrompt in achieving SOTA rehearsal-free FCL performance.

* Accepted by FL-ICML 2023

Tight Memory-Regret Lower Bounds for Streaming Bandits

Jun 13, 2023

Shaoang Li, Lan Zhang, Junhao Wang, Xiang-Yang Li

Abstract: In this paper, we investigate the streaming bandits problem, wherein the learner aims to minimize regret by dealing with online arriving arms and sublinear arm memory. We establish the tight worst-case regret lower bound of $\Omega \left( (TB)^{\alpha} K^{1-\alpha}\right), \alpha = 2^{B} / (2^{B+1}-1)$ for any algorithm with a time horizon $T$, number of arms $K$, and number of passes $B$. The result reveals a separation between the stochastic bandits problem in the classical centralized setting and the streaming setting with bounded arm memory. Notably, in comparison to the well-known $\Omega(\sqrt{KT})$ lower bound, an additional double logarithmic factor is unavoidable for any streaming bandits algorithm with sublinear memory permitted. Furthermore, we establish the first instance-dependent lower bound of $\Omega \left(T^{1/(B+1)} \sum_{\Delta_x>0} \frac{\mu^*}{\Delta_x}\right)$ for streaming bandits. These lower bounds are derived through a unique reduction from the regret-minimization setting to the sample complexity analysis for a sequence of $\epsilon$-optimal arms identification tasks, which may be of independent interest. To complement the lower bound, we also provide a multi-pass algorithm that achieves a regret upper bound of $\tilde{O} \left( (TB)^{\alpha} K^{1 - \alpha}\right)$ using constant arm memory.
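To make the exponent in the worst-case bound concrete, here is a minimal sketch that evaluates $\alpha = 2^{B} / (2^{B+1}-1)$ for a few pass counts and the resulting scale $(TB)^{\alpha} K^{1-\alpha}$. The function names `regret_exponent` and `lower_bound_scale` are illustrative and not from the paper, and the constant hidden by $\Omega(\cdot)$ is ignored.

```python
from fractions import Fraction


def regret_exponent(passes: int) -> Fraction:
    """Return alpha = 2^B / (2^(B+1) - 1) as an exact fraction (B = passes)."""
    if passes < 1:
        raise ValueError("passes must be >= 1")
    return Fraction(2 ** passes, 2 ** (passes + 1) - 1)


def lower_bound_scale(T: float, K: float, B: int) -> float:
    """Return (T*B)^alpha * K^(1-alpha); the hidden Omega constant is ignored."""
    alpha = float(regret_exponent(B))
    return (T * B) ** alpha * K ** (1 - alpha)


if __name__ == "__main__":
    # The exponent decreases toward 1/2 as the number of passes grows.
    for B in (1, 2, 3, 10):
        alpha = regret_exponent(B)
        print(f"B={B:2d}  alpha={alpha}  ~ {float(alpha):.4f}")
```

With a single pass the exponent is 2/3, and it approaches 1/2 (the exponent of $T$ in $\sqrt{KT}$) as $B$ increases.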

FedSDG-FS: Efficient and Secure Feature Selection for Vertical Federated Learning

Feb 21, 2023

Anran Li, Hongyi Peng, Lan Zhang, Jiahui Huang, Qing Guo, Han Yu, Yang Liu

Abstract: Vertical Federated Learning (VFL) enables multiple data owners, each holding a different subset of features about largely overlapping sets of data samples, to jointly train a useful global model. Feature selection (FS) is important to VFL. It is still an open research problem, as existing FS works designed for VFL assume prior knowledge of either the number of noisy features or the post-training threshold of useful features to be selected, making them unsuitable for practical applications. To bridge this gap, we propose the Federated Stochastic Dual-Gate based Feature Selection (FedSDG-FS) approach. It consists of a Gaussian stochastic dual-gate to efficiently approximate the probability of a feature being selected, with privacy protection through Partially Homomorphic Encryption without a trusted third party. To reduce overhead, we propose a feature importance initialization method based on Gini impurity, which can accomplish its goals with only two parameter transmissions between the server and the clients. Extensive experiments on both synthetic and real-world datasets show that FedSDG-FS significantly outperforms existing approaches in terms of achieving accurate selection of high-quality features as well as building global models with improved performance.
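As background for the Gini-impurity-based initialization mentioned in the abstract, the sketch below shows the textbook Gini impurity and the impurity decrease of a single threshold split, computed in plaintext. It is not the paper's encrypted protocol: the helper names, the fixed threshold, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from typing import Sequence


def gini_impurity(labels: Sequence[int]) -> float:
    """Textbook Gini impurity: 1 - sum_k p_k^2 over the class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_gain(feature: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """Impurity decrease when splitting the samples at `threshold` (hypothetical score)."""
    n = len(labels)
    if n == 0:
        return 0.0
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    weighted = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    informative = [0.1, 0.2, 0.8, 0.9]  # separates the two classes at 0.5
    noisy = [0.1, 0.9, 0.2, 0.8]        # unrelated to the labels
    print(gini_gain(informative, labels, 0.5))
    print(gini_gain(noisy, labels, 0.5))
```

A feature whose split separates the classes gets a larger score than one that leaves both sides mixed, which is the intuition behind using such scores as initial importance values.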

* 10 pages, 8 figures

Which Features are Learned by CodeBert: An Empirical Study of the BERT-based Source Code Representation Learning

Jan 20, 2023

Lan Zhang, Chen Cao, Zhilong Wang, Peng Liu

Abstract: Bidirectional Encoder Representations from Transformers (BERT) was proposed for natural language processing (NLP) and has shown promising results. Recently, researchers have applied BERT to source-code representation learning and reported good results on several downstream tasks. However, in this paper, we illustrate that current methods cannot effectively understand the logic of source code. The representation of source code relies heavily on programmer-defined variable and function names. We design and implement a set of experiments to demonstrate our conjecture and provide some insights for future work.

* 1 table, 2 figures

FedTiny: Pruned Federated Learning Towards Specialized Tiny Models

Dec 05, 2022

Hong Huang, Lan Zhang, Chaoyue Sun, Ruogu Fang, Xiaoyong Yuan, Dapeng Wu

Abstract: Neural network pruning has been a well-established compression technique to enable deep learning models on resource-constrained devices. The pruned model is usually specialized to meet specific hardware platforms and training tasks (defined as deployment scenarios). However, existing pruning approaches rely heavily on training data to trade off model size, efficiency, and accuracy, which becomes ineffective for federated learning (FL) over distributed and confidential datasets. Moreover, the memory- and compute-intensive pruning process of most existing approaches cannot be handled by most FL devices with resource limitations. In this paper, we develop FedTiny, a novel distributed pruning framework for FL, to obtain specialized tiny models for memory- and computing-constrained participating devices with confidential local data. To alleviate biased pruning due to unseen heterogeneous data over devices, FedTiny introduces an adaptive batch normalization (BN) selection module to adaptively obtain an initially pruned model to fit deployment scenarios. Besides, to further improve the initial pruning, FedTiny develops a lightweight progressive pruning module for local finer pruning under tight memory and computational budgets, where the pruning policy for each layer is gradually determined rather than evaluating the overall deep model structure. Extensive experimental results demonstrate the effectiveness of FedTiny, which outperforms state-of-the-art baseline approaches, especially when compressing deep models to extremely sparse tiny models.

MLink: Linking Black-Box Models from Multiple Domains for Collaborative Inference

Sep 28, 2022

Mu Yuan, Lan Zhang, Zimu Zheng, Yi-Nan Zhang, Xiang-Yang Li

Abstract: The cost efficiency of model inference is critical to real-world machine learning (ML) applications, especially for delay-sensitive tasks and resource-limited devices. A typical dilemma is that, in order to provide complex intelligent services (e.g., smart city), we need inference results from multiple ML models, but the cost budget (e.g., GPU memory) is not enough to run all of them. In this work, we study underlying relationships among black-box ML models and propose a novel learning task: model linking, which aims to bridge the knowledge of different black-box models by learning mappings (dubbed model links) between their output spaces. We propose a design of model links that supports linking heterogeneous black-box ML models. Also, in order to address the distribution discrepancy challenge, we present adaptation and aggregation methods for model links. Based on our proposed model links, we developed a scheduling algorithm, named MLink. Through collaborative multi-model inference enabled by model links, MLink can improve the accuracy of obtained inference results under the cost budget. We evaluated MLink on a multi-modal dataset with seven different ML models and two real-world video analytics systems with six ML models and 3,264 hours of video. Experimental results show that our proposed model links can be effectively built among various black-box models. Under the budget of GPU memory, MLink can save 66.7% inference computations while preserving 94% inference accuracy, which outperforms multi-task learning, deep reinforcement learning-based scheduler, and frame filtering baselines.

* 36th AAAI Conference on Artificial Intelligence (AAAI '22)

InFi: End-to-End Learning to Filter Input for Resource-Efficiency in Mobile-Centric Inference

Sep 28, 2022

Mu Yuan, Lan Zhang, Fengxiang He, Xueting Tong, Miao-Hui Song, Xiang-Yang Li

Abstract: Mobile-centric AI applications have high requirements for the resource efficiency of model inference. Input filtering is a promising approach to eliminate redundancy and thereby reduce the cost of inference. Previous efforts have tailored effective solutions for many applications, but left two essential questions unanswered: (1) theoretical filterability of an inference workload to guide the application of input filtering techniques, thereby avoiding the trial-and-error cost for resource-constrained mobile applications; (2) robust discriminability of feature embedding to allow input filtering to be widely effective for diverse inference tasks and input content. To answer them, we first formalize the input filtering problem and theoretically compare the hypothesis complexity of inference models and input filters to understand the optimization potential. Then we propose the first end-to-end learnable input filtering framework that covers most state-of-the-art methods and surpasses them in feature embedding with robust discriminability. We design and implement InFi, which supports six input modalities and multiple mobile-centric deployments. Comprehensive evaluations confirm our theoretical results and show that InFi outperforms strong baselines in applicability, accuracy, and efficiency. InFi achieves 8.5x throughput and saves 95% bandwidth, while keeping over 90% accuracy, for a video analytics application on mobile platforms.

* 28th Annual International Conference on Mobile Computing And Networking (MobiCom '22)

Federated Semi-Supervised Domain Adaptation via Knowledge Transfer

Jul 25, 2022

Madhureeta Das, Xianhao Chen, Xiaoyong Yuan, Lan Zhang

Abstract: Given the rapidly changing machine learning environments and expensive data labeling, semi-supervised domain adaptation (SSDA) is imperative when the labeled data from the source domain is statistically different from the partially labeled data from the target domain. Most prior SSDA research is centrally performed, requiring access to both source and target data. However, data in many fields nowadays is generated by distributed end devices. Due to privacy concerns, the data might be locally stored and cannot be shared, which renders existing SSDA approaches ineffective. This paper proposes an innovative approach to achieve SSDA over multiple distributed and confidential datasets, named Federated Semi-Supervised Domain Adaptation (FSSDA). FSSDA integrates SSDA with federated learning based on strategically designed knowledge distillation techniques, whose efficiency is improved by performing source and target training in parallel. Moreover, FSSDA controls the amount of knowledge transferred across domains by properly selecting a key parameter, i.e., the imitation parameter. Further, the proposed FSSDA can be effectively generalized to multi-source domain adaptation scenarios. Extensive experiments are conducted to demonstrate the effectiveness and efficiency of the FSSDA design.

* 8 pages, 5 figures

Topology-aware Generalization of Decentralized SGD

Jun 28, 2022

Tongtian Zhu, Fengxiang He, Lan Zhang, Zhengyang Niu, Mingli Song, Dacheng Tao

Abstract: This paper studies the algorithmic stability and generalizability of decentralized stochastic gradient descent (D-SGD). We prove that the consensus model learned by D-SGD is $\mathcal{O}{(m/N+1/m+\lambda^2)}$-stable in expectation in the non-convex non-smooth setting, where $N$ is the total sample size of the whole system, $m$ is the worker number, and $1-\lambda$ is the spectral gap that measures the connectivity of the communication topology. These results then deliver an $\mathcal{O}{(1/N+{({(m^{-1}\lambda^2)}^{\frac{\alpha}{2}}+ m^{-\alpha})}/{N^{1-\frac{\alpha}{2}}})}$ in-average generalization bound, which is non-vacuous even when $\lambda$ is close to $1$, in contrast to the vacuous bounds suggested by existing literature on the projected version of D-SGD. Our theory indicates that the generalizability of D-SGD has a positive correlation with the spectral gap, and can explain why consensus control in the initial training phase can ensure better generalization. Experiments with VGG-11 and ResNet-18 on CIFAR-10, CIFAR-100 and Tiny-ImageNet justify our theory. To the best of our knowledge, this is the first work on the topology-aware generalization of vanilla D-SGD. Code is available at https://github.com/Raiden-Zhu/Generalization-of-DSGD.

* Accepted for publication in ICML 2022

Residue-based Label Protection Mechanisms in Vertical Logistic Regression

May 09, 2022

Juntao Tan, Lan Zhang, Yang Liu, Anran Li, Ye Wu

Abstract: Federated learning (FL) enables distributed participants to collaboratively learn a global model without revealing their private data to each other. Recently, vertical FL, where the participants hold the same set of samples but with different features, has received increased attention. This paper first presents a label inference attack method to investigate the potential privacy leakage of the vertical logistic regression model. Specifically, we discover that the attacker can utilize the residue variables, which are calculated by solving the system of linear equations constructed from the local dataset and the received decrypted gradients, to infer the privately owned labels. To deal with this, we then propose three protection mechanisms, namely an additive noise mechanism, a multiplicative noise mechanism, and a hybrid mechanism that leverages local differential privacy and homomorphic encryption techniques, to prevent the attack and improve the robustness of the vertical logistic regression model. Experimental results show that both the additive noise mechanism and the multiplicative noise mechanism can achieve efficient label protection with only a slight drop in model testing accuracy. Furthermore, the hybrid mechanism can achieve label protection without any testing accuracy degradation, which demonstrates the effectiveness and efficiency of our protection techniques.

* Accepted by 8th International Conference on Big Data Computing and Communications (BigCom) 2022
