Hongyi Zhang

EdgeFL: A Lightweight Decentralized Federated Learning Framework

Sep 06, 2023
Hongyi Zhang, Jan Bosch, Helena Holmström Olsson

Federated Learning (FL) has emerged as a promising approach for collaborative machine learning that addresses data privacy concerns. However, existing FL platforms and frameworks often present challenges for software engineers in terms of complexity, limited customization options, and scalability limitations. In this paper, we introduce EdgeFL, an edge-only lightweight decentralized FL framework designed to overcome the centralized-aggregation and scalability limitations of FL deployments. By adopting an edge-only model training and aggregation approach, EdgeFL eliminates the need for a central server, enabling seamless scalability across diverse use cases. With a straightforward integration process requiring just four lines of code (LOC), software engineers can easily incorporate FL functionality into their AI products. Furthermore, EdgeFL offers the flexibility to customize aggregation functions, empowering engineers to adapt them to specific needs. Our evaluation shows that EdgeFL outperforms existing FL platforms and frameworks: it reduces weight-update latency and enables faster model evolution, enhancing the efficiency of edge devices, and it achieves improved classification accuracy compared to traditional centralized FL approaches. By leveraging EdgeFL, software engineers can harness the benefits of federated learning while avoiding the challenges associated with existing FL platforms and frameworks.
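
As a sketch of what serverless, edge-only aggregation looks like, consider the toy below; the class and function names are illustrative assumptions, not EdgeFL's actual API.

```python
# Illustrative sketch of edge-only decentralized aggregation; EdgeNode and
# average_weights are hypothetical names, not EdgeFL's real interface.

def average_weights(weight_sets):
    """Element-wise mean over peers' weight vectors (FedAvg-style rule)."""
    n = len(weight_sets)
    return [sum(ws) / n for ws in zip(*weight_sets)]

class EdgeNode:
    def __init__(self, weights, aggregate=average_weights):
        self.weights = list(weights)
        self.aggregate = aggregate      # customizable aggregation function

    def local_update(self, gradients, lr=0.1):
        # One local training step (plain gradient descent placeholder).
        self.weights = [w - lr * g for w, g in zip(self.weights, gradients)]

    def sync_with_peers(self, peers):
        # Pull weights directly from peers; no central server involved.
        self.weights = self.aggregate([self.weights] + [p.weights for p in peers])

a, b = EdgeNode([1.0, 2.0]), EdgeNode([3.0, 4.0])
a.sync_with_peers([b])   # a.weights becomes [2.0, 3.0]
```

Because every node both trains and aggregates, adding a node only means handing it a peer list, which is where the claimed scalability comes from.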

5G Network on Wings: A Deep Reinforcement Learning Approach to UAV-based Integrated Access and Backhaul

Feb 07, 2022
Hongyi Zhang, Jingya Li, Zhiqiang Qi, Xingqin Lin, Anders Aronsson, Jan Bosch, Helena Holmström Olsson

Fast and reliable wireless communication has become a critical demand in human life. When natural disasters strike, providing ubiquitous connectivity with traditional wireless networks becomes challenging. In this context, unmanned aerial vehicle (UAV) based aerial networks offer a promising alternative for fast, flexible, and reliable wireless communications in mission-critical (MC) scenarios. Owing to their unique characteristics, such as mobility, flexible deployment, and rapid reconfiguration, drones can readily change location to provide on-demand communications to users on the ground in emergency scenarios. As a result, UAV base stations (UAV-BSs) are considered an appropriate means of providing rapid connectivity in MC scenarios. In this paper, we study how to control a UAV-BS in both static and dynamic environments. We investigate a situation in which a macro BS is destroyed by a natural disaster and a UAV-BS is deployed using integrated access and backhaul (IAB) technology to provide coverage for users in the disaster area. We present a data collection system, signaling procedures, and machine learning applications for this use case. A deep reinforcement learning algorithm is developed to jointly optimize the tilt of the access and backhaul antennas of the UAV-BS as well as its three-dimensional placement. Evaluation results show that the proposed algorithm can autonomously navigate and configure the UAV-BS to satisfactorily serve the MC users on the ground.
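
The joint control problem can be pictured with a small sketch: the agent adjusts the UAV-BS's 3-D position plus the access and backhaul antenna tilts. The discrete action set and epsilon-greedy policy below are illustrative assumptions, not the paper's exact design.

```python
import random

# Hypothetical discretization of the joint action space described in the
# abstract: 3-D movement plus two antenna-tilt adjustments.
ACTIONS = [
    ("dx", +1), ("dx", -1), ("dy", +1), ("dy", -1), ("dz", +1), ("dz", -1),
    ("access_tilt", +1), ("access_tilt", -1),
    ("backhaul_tilt", +1), ("backhaul_tilt", -1),
]

def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """Pick an action index: explore with probability epsilon, else exploit
    the action with the highest estimated value."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda i: q_values[i])
```

In a deep RL setting the `q_values` vector would come from a neural network evaluated on the UAV-BS's current state (position, tilts, measured user throughput).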

Autonomous Navigation and Configuration of Integrated Access Backhauling for UAV Base Station Using Reinforcement Learning

Dec 14, 2021
Hongyi Zhang, Jingya Li, Zhiqiang Qi, Xingqin Lin, Anders Aronsson, Jan Bosch, Helena Holmström Olsson

Fast and reliable connectivity is essential to enhancing situational awareness and operational efficiency for public safety mission-critical (MC) users. In emergency or disaster circumstances, where existing cellular network coverage and capacity may not be able to meet MC communication demands, deployable-network-based solutions such as cells-on-wheels/wings can be utilized swiftly to ensure reliable connection for MC users. In this paper, we consider a scenario where a macro base station (BS) is destroyed by a natural disaster and an unmanned aerial vehicle carrying a BS (UAV-BS) is set up to provide temporary coverage for users in the disaster area. The UAV-BS is integrated into the mobile network using 5G integrated access and backhaul (IAB) technology. We propose a framework and signalling procedure for applying machine learning to this use case. A deep reinforcement learning algorithm is designed to jointly optimize the access and backhaul antenna tilts as well as the three-dimensional location of the UAV-BS in order to best serve the on-ground MC users while maintaining a good backhaul connection. Our results show that the proposed algorithm can autonomously navigate and configure the UAV-BS to improve the throughput and reduce the drop rate of MC users.

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible 

Context-Aware Legal Citation Recommendation using Deep Learning

Jun 20, 2021
Zihan Huang, Charles Low, Mengqiu Teng, Hongyi Zhang, Daniel E. Ho, Mark S. Krass, Matthias Grabmair

Lawyers and judges spend a large amount of time researching the proper legal authority to cite while drafting decisions. In this paper, we develop a citation recommendation tool that can help improve efficiency in the process of opinion drafting. We train four types of machine learning models, including a citation-list based method (collaborative filtering) and three context-based methods (text similarity, BiLSTM and RoBERTa classifiers). Our experiments show that leveraging local textual context improves recommendation, and that deep neural models achieve decent performance. We show that non-deep text-based methods benefit from access to structured case metadata, but deep models only benefit from such access when predicting from context of insufficient length. We also find that, even after extensive training, RoBERTa does not outperform a recurrent neural model, despite its benefits of pretraining. Our behavior analysis of the RoBERTa model further shows that predictive performance is stable across time and citation classes.
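
The text-similarity baseline's core idea can be sketched as a bag-of-words toy: recommend the citation whose associated context is most cosine-similar to the query context. The case names and texts below are invented for illustration; the actual models and data are far richer.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(query, corpus):
    """corpus maps citation -> example context text; return the best match."""
    q = Counter(query.lower().split())
    return max(corpus, key=lambda c: cosine(q, Counter(corpus[c].lower().split())))

corpus = {
    "Smith v. Jones": "negligence duty of care owed to the plaintiff",
    "Doe v. Roe": "damages for breach of contract terms",
}
recommend("breach of the contract", corpus)   # -> "Doe v. Roe"
```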

* 10 pages published in Proceedings of ICAIL 2021; link to data here: https://reglab.stanford.edu/data/bva-case-citation-dataset ; code available here: https://github.com/TUMLegalTech/bva-citation-prediction 

One Backward from Ten Forward, Subsampling for Large-Scale Deep Learning

Apr 27, 2021
Chaosheng Dong, Xiaojie Jin, Weihao Gao, Yijia Wang, Hongyi Zhang, Xiang Wu, Jianchao Yang, Xiaobing Liu

Deep learning models in large-scale machine learning systems are often continuously trained with enormous amounts of data from production environments. The sheer volume of streaming training data poses a significant challenge to real-time training subsystems, and ad-hoc sampling is the standard practice. Our key insight is that these deployed ML systems continuously perform forward passes on data instances during inference, yet ad-hoc sampling does not take advantage of this substantial computational effort. We therefore propose to record a constant amount of information per instance from these forward passes. The extra information measurably improves the selection of which data instances should participate in forward and backward passes. We propose a novel optimization framework to analyze this problem and provide an efficient approximation algorithm under the framework of mini-batch gradient descent as a practical solution. We also demonstrate the effectiveness of our framework and algorithm on several large-scale classification and regression tasks, compared with competitive baselines widely used in industry.
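
A minimal sketch of the selection step, under the assumption that the recorded per-instance information is simply the most recent forward-pass loss; the paper's recorded statistics and selection rule may differ.

```python
import heapq

def select_for_backward(records, budget):
    """From (instance_id, recorded_loss) pairs, admit the `budget` instances
    with the largest recorded loss to the next backward pass."""
    return [i for i, _ in heapq.nlargest(budget, records, key=lambda r: r[1])]

# Losses recorded essentially for free during inference-time forward passes:
records = [("a", 0.1), ("b", 2.3), ("c", 0.7), ("d", 1.5)]
select_for_backward(records, 2)   # -> ["b", "d"]
```

The point of the design is that the ranking signal costs nothing extra: the forward passes happen anyway during serving, so only a constant-size record per instance must be kept.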

* 13 pages 

Real-time End-to-End Federated Learning: An Automotive Case Study

Mar 22, 2021
Hongyi Zhang, Jan Bosch, Helena Holmström Olsson

With the development of and increasing interest in ML/DL, companies are eager to utilize these methods to improve their service quality and user experience. Federated Learning has been introduced as an efficient model training approach that distributes and speeds up time-consuming model training while preserving user data privacy. However, common Federated Learning methods apply a synchronized protocol to perform model aggregation, which turns out to be inflexible and unable to adapt to rapidly evolving environments and heterogeneous hardware settings in real-world systems. In this paper, we introduce an approach to real-time end-to-end Federated Learning combined with a novel asynchronous model aggregation protocol. We validate our approach in an industrial use case in the automotive domain, focusing on steering-wheel-angle prediction for autonomous driving. Our results show that asynchronous Federated Learning can significantly improve the prediction performance of local edge models and reach the same accuracy level as centralized machine learning. Moreover, the approach can reduce communication overhead, accelerate model training, and consume real-time streaming data by utilizing a sliding training window, demonstrating high efficiency when deploying ML/DL components to heterogeneous real-world embedded systems.
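
The sliding training window for streaming data can be sketched in a few lines; the class name and window policy are assumptions, not the paper's exact protocol.

```python
from collections import deque

class SlidingWindowTrainer:
    """Illustrative sliding window over a real-time sample stream."""

    def __init__(self, window_size):
        self.window = deque(maxlen=window_size)   # old samples fall out

    def ingest(self, sample):
        self.window.append(sample)

    def training_batch(self):
        # Local training always sees only the freshest samples, so the
        # edge model can track a drifting data distribution.
        return list(self.window)

t = SlidingWindowTrainer(window_size=3)
for s in [1, 2, 3, 4, 5]:
    t.ingest(s)
t.training_batch()   # -> [3, 4, 5]
```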

Label Leakage and Protection in Two-party Split Learning

Feb 17, 2021
Oscar Li, Jiankai Sun, Xin Yang, Weihao Gao, Hongyi Zhang, Junyuan Xie, Virginia Smith, Chong Wang

In vertical federated learning, two-party split learning has become an important topic and has found many applications in real business scenarios. However, how to prevent the participants' ground-truth labels from possible leakage is not well studied. In this paper, we consider this question in an imbalanced binary classification setting, a common case in online business applications. We first show that the norm attack, a simple method that uses the norm of the gradients communicated between the parties, can largely reveal the participants' ground-truth labels. We then discuss several protection techniques to mitigate this issue. Among them, we design a principled approach that directly maximizes the worst-case error of label detection, which proves more effective at countering the norm attack and beyond. We experimentally demonstrate the competitiveness of our proposed method compared to several other baselines.
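
The intuition behind the norm attack can be reproduced in a toy setting (illustrative only, not the paper's exact setup): for binary cross-entropy the per-example gradient with respect to the logit has magnitude |sigmoid(z) - y|, so when class imbalance biases scores toward the negative class, rare positives produce visibly larger gradient norms.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_norm(logit, label):
    # For binary cross-entropy, |d loss / d logit| = |sigmoid(z) - y|.
    return abs(sigmoid(logit) - label)

# A model that has absorbed the class imbalance outputs a low score:
logit = -2.0
neg_norm = grad_norm(logit, 0)   # small: prediction is already near 0
pos_norm = grad_norm(logit, 1)   # large: rare positive needs a big correction
assert pos_norm > neg_norm       # this gap is what the norm attack exploits
```

A party that only sees the communicated gradients can threshold their norms to guess the labels, which is why the defense targets the worst-case error of exactly this kind of detector.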

Fixup Initialization: Residual Learning Without Normalization

Jan 27, 2019
Hongyi Zhang, Yann N. Dauphin, Tengyu Ma

Normalization layers are a staple in state-of-the-art deep neural network architectures. They are widely believed to stabilize training, enable higher learning rates, accelerate convergence, and improve generalization, though the reason for their effectiveness is still an active research topic. In this work, we challenge these commonly-held beliefs by showing that none of the perceived benefits is unique to normalization. Specifically, we propose fixed-update initialization (Fixup), an initialization motivated by solving the exploding and vanishing gradient problem at the beginning of training via properly rescaling a standard initialization. We find training residual networks with Fixup to be as stable as training with normalization, even for networks with 10,000 layers. Furthermore, with proper regularization, Fixup enables residual networks without normalization to achieve state-of-the-art performance in image classification and machine translation.
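
A sketch of the rescaling, assuming a ResNet-like network with L residual branches of m weight layers each: the scale L**(-1/(2m - 2)) applied on top of a standard initialization and the zero-init of each branch's last layer follow the Fixup recipe, but treat the rest of this snippet as a simplification.

```python
import math
import random

def fixup_scale(num_branches, layers_per_branch):
    """Fixup rescaling factor L**(-1/(2m - 2)) for a network with
    L residual branches of m weight layers each."""
    return num_branches ** (-1.0 / (2 * layers_per_branch - 2))

def init_branch(fan_in, num_branches, layers_per_branch):
    """Initialize one residual branch: He init scaled by the Fixup factor
    for all but the last layer, which is zeroed so the branch starts as
    the identity mapping."""
    s = fixup_scale(num_branches, layers_per_branch)
    std = math.sqrt(2.0 / fan_in)              # He initialization
    branch = [[random.gauss(0.0, std) * s for _ in range(fan_in)]
              for _ in range(layers_per_branch - 1)]
    branch.append([0.0] * fan_in)              # zero last layer
    return branch

# With L = 16 branches of m = 2 layers, non-final layers are scaled by
# 16**(-1/2) = 0.25, taming output growth at initialization.
```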

* Accepted for publication at ICLR 2019; see https://openreview.net/forum?id=H1gsz30cKX 

R-SPIDER: A Fast Riemannian Stochastic Optimization Algorithm with Curvature Independent Rate

Nov 28, 2018
Jingzhao Zhang, Hongyi Zhang, Suvrit Sra

We study smooth stochastic optimization problems on Riemannian manifolds. By adapting the recently proposed SPIDER algorithm \citep{fang2018spider} (a variance-reduced stochastic method) to Riemannian manifolds, we achieve faster rates than known algorithms in both the finite-sum and stochastic settings. Unlike previous works, by \emph{not} resorting to bounding iterate distances, our analysis yields curvature-independent convergence rates for both the nonconvex and strongly convex cases.
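
As a rough sketch of the kind of estimator involved (notation assumed here, not quoted from the paper): SPIDER maintains a recursive variance-reduced gradient estimate, and the Riemannian adaptation must parallel-transport the previous estimate to the current iterate before reusing it:

```latex
% v_t: variance-reduced gradient estimate; \Gamma: parallel transport;
% \operatorname{Exp}: exponential map; S_t: minibatch; \eta: step size.
v_t = \frac{1}{|S_t|} \sum_{i \in S_t}
      \Big( \operatorname{grad} f_i(x_t)
            - \Gamma_{x_{t-1}}^{x_t} \operatorname{grad} f_i(x_{t-1}) \Big)
      + \Gamma_{x_{t-1}}^{x_t} v_{t-1},
\qquad
x_{t+1} = \operatorname{Exp}_{x_t}\!\left( -\eta\, v_t \right).
```

The transport step is what makes gradients taken at different points on the manifold comparable, replacing the plain vector subtraction used in the Euclidean version.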

* arXiv admin note: text overlap with arXiv:1605.07147 

Towards Riemannian Accelerated Gradient Methods

Jun 07, 2018
Hongyi Zhang, Suvrit Sra

We propose a Riemannian version of Nesterov's Accelerated Gradient algorithm (RAGD), and show that for geodesically smooth and strongly convex problems, within a neighborhood of the minimizer whose radius depends on the condition number as well as the sectional curvature of the manifold, RAGD converges to the minimizer with acceleration. Unlike the algorithm in (Liu et al., 2017), which requires the exact solution to a nonlinear equation that may in turn be intractable, our algorithm is constructive and computationally tractable. Our proof exploits a new estimate sequence and a novel bound on the nonlinear metric distortion, both of which may be of independent interest.

* Published in 31th Annual Conference on Learning Theory (COLT'18) 