Edge intelligence has emerged as a promising computing paradigm for supporting diverse smart applications that rely on machine learning techniques. While the community has extensively investigated multi-tier edge deployment for traditional deep learning models (e.g., CNNs, RNNs), the emerging Graph Neural Networks (GNNs) remain underexplored, in stark contrast to their broad edge adoption in applications such as traffic flow forecasting and location-based social recommendation. To bridge this gap, this paper formally studies cost optimization for distributed GNN processing over a multi-tier heterogeneous edge network. We build a comprehensive modeling framework that captures a variety of cost factors, based on which we formulate a cost-efficient graph layout optimization problem that is proved to be NP-hard. Instead of trivially applying traditional data placement wisdom, we theoretically reveal the structural property of quadratic submodularity implied by the GNN's unique computing pattern, which motivates our design of an efficient iterative solution exploiting graph cuts. Rigorous analysis shows that it provides a parameterized constant approximation ratio, guaranteed convergence, and exact feasibility. To tackle potential graph topological evolution during GNN processing, we further devise an incremental update strategy and an adaptive scheduling algorithm for lightweight dynamic layout optimization. Evaluations with real-world datasets and various GNN benchmarks demonstrate that our approach achieves superior performance over de facto baselines, with more than 95.8% cost reduction and fast convergence.
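As a hedged illustration of the graph-cut machinery (a minimal sketch, not the paper's full multi-tier algorithm), the snippet below performs one two-tier layout move by an s-t minimum cut; the placement cost `p[v][t]` and the cross-tier communication weights are hypothetical stand-ins for the paper's cost model.

```python
# Minimal two-tier layout move via an s-t minimum cut (illustrative only;
# the paper's multi-tier model, quadratic terms, and approximation
# analysis are not reproduced here).
import networkx as nx

def two_tier_cut(vertices, edges, p, tiers=(0, 1)):
    """Assign each vertex to one of two tiers by a minimum s-t cut.

    vertices: iterable of vertex ids.
    edges: iterable of (u, v, w) with communication weight w paid
           whenever u and v land on different tiers.
    p: dict, p[v][t] = placement cost of putting vertex v on tier t.
    """
    s, t = "SRC", "SNK"
    G = nx.DiGraph()
    for v in vertices:
        G.add_edge(s, v, capacity=p[v][tiers[1]])  # cut -> v goes to tiers[1]
        G.add_edge(v, t, capacity=p[v][tiers[0]])  # cut -> v goes to tiers[0]
    for u, v, w in edges:  # cross-tier traffic is paid iff the edge is cut
        G.add_edge(u, v, capacity=w)
        G.add_edge(v, u, capacity=w)
    cut_value, (side_s, side_t) = nx.minimum_cut(G, s, t)
    layout = {v: tiers[0] if v in side_s else tiers[1] for v in vertices}
    return cut_value, layout
```

Iterating such cut-based moves over pairs of tiers is one standard way to exploit the kind of submodular structure the abstract identifies.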
In this work, we develop an efficient solver based on deep neural networks for the Poisson equation with variable coefficients and singular sources expressed by the Dirac delta function $\delta(\mathbf{x})$. This class of problems covers general point sources, line sources, and point-line combinations, and has a broad range of practical applications. The proposed approach decomposes the true solution into a singular part, known analytically via the fundamental solution of the Laplace equation, and a regular part that satisfies a suitable elliptic PDE with a smoother source, and then solves for the regular part using the deep Ritz method. A path-following strategy is proposed for selecting the penalty parameter that enforces the Dirichlet boundary condition. Extensive numerical experiments in two- and multi-dimensional spaces with point sources, line sources, or their combinations illustrate the efficiency of the proposed approach, and a comparative study with several existing approaches clearly shows its competitiveness for this specific class of problems. In addition, we briefly discuss the error analysis of the approach.
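As a hedged illustration, consider a hypothetical model instance with a single point source, $-\nabla\cdot(a(\mathbf{x})\nabla u) = c\,\delta(\mathbf{x}-\mathbf{x}_0)$ in $\Omega$ with $u = g$ on $\partial\Omega$; the decomposition and the penalized deep Ritz objective then take roughly the following form (the exact elliptic problem for the regular part depends on the variable coefficient and is detailed in the paper):

```latex
\[
  u = u_s + u_r, \qquad
  u_s(\mathbf{x}) = \frac{c}{a(\mathbf{x}_0)}\,\Phi(\mathbf{x}-\mathbf{x}_0),
\]
% Phi is the fundamental solution of the Laplacian:
%   Phi(x) = -(1/(2*pi)) log|x|  in 2D,   Phi(x) = 1/(4*pi*|x|)  in 3D.
% The regular part u_r, with a smoother right-hand side f_r and boundary
% data g - u_s, is approximated by the deep Ritz method:
\[
  \min_{\theta}\;
  \int_{\Omega} \Big( \tfrac{1}{2}\,a\,|\nabla u_\theta|^2
                      - f_r\, u_\theta \Big)\,\mathrm{d}\mathbf{x}
  \;+\; \beta \int_{\partial\Omega}
        \big(u_\theta - (g - u_s)\big)^2 \,\mathrm{d}s ,
\]
% with the penalty parameter increased along a path-following schedule
% beta_0 < beta_1 < ... rather than fixed a priori.
```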
Semi-supervised learning (SSL) improves model generalization by leveraging massive unlabeled data to augment limited labeled samples. However, popular SSL evaluation protocols are currently often constrained to computer vision (CV) tasks. In addition, previous work typically trains deep neural networks from scratch, which is time-consuming and environmentally unfriendly. To address these issues, we construct a Unified SSL Benchmark (USB) by selecting 15 diverse, challenging, and comprehensive tasks from CV, natural language processing (NLP), and audio processing (Audio), on which we systematically evaluate dominant SSL methods; we also open-source a modular and extensible codebase for the fair evaluation of these methods. We further provide pre-trained versions of state-of-the-art neural models for CV tasks to make further tuning affordable. USB enables the evaluation of a single SSL algorithm on more tasks from multiple domains at less cost. Specifically, on a single NVIDIA V100, only 37 GPU days are required to evaluate FixMatch on the 15 tasks in USB, whereas 335 GPU days (279 GPU days on 4 CV datasets excluding ImageNet) are needed for 5 CV tasks under the typical protocol.
LAMDA-SSL is open-sourced on GitHub and its detailed usage documentation is available at https://ygzwqzd.github.io/LAMDA-SSL/. This documentation introduces LAMDA-SSL in detail from various aspects and is divided into four parts. The first part introduces the design ideas, features, and functionality of LAMDA-SSL. The second part demonstrates the usage of LAMDA-SSL through abundant detailed examples. The third part introduces all algorithms implemented in LAMDA-SSL to help users quickly understand and choose SSL algorithms. The fourth part presents the APIs of LAMDA-SSL. This detailed documentation greatly reduces the cost for users of familiarizing themselves with the LAMDA-SSL toolkit and SSL algorithms.
Semi-supervised learning (SSL) is the branch of machine learning that aims to improve learning performance by leveraging unlabeled data when labels are insufficient. Recently, SSL with deep models has proven successful on standard benchmark tasks. However, such methods are still vulnerable to various robustness threats in real-world applications, because these benchmarks provide perfect unlabeled data, whereas in realistic scenarios unlabeled data can be corrupted. Many researchers have pointed out that after exploiting corrupted unlabeled data, SSL suffers severe performance degradation. Thus, there is an urgent need to develop SSL algorithms that work robustly with corrupted unlabeled data. To fully understand robust SSL, we conduct a survey study. We first give a formal definition of robust SSL from the perspective of machine learning. Then, we classify robustness threats into three categories: i) distribution corruption, i.e., the unlabeled data distribution is mismatched with the labeled data; ii) feature corruption, i.e., the features of unlabeled examples are adversarially attacked; and iii) label corruption, i.e., the label distribution of the unlabeled data is imbalanced. Under this unified taxonomy, we provide a thorough review and discussion of recent work that addresses these issues. Finally, we propose promising directions within robust SSL to provide insights for future research.
With the wide penetration of smart robots into multifarious fields, the Simultaneous Localization and Mapping (SLAM) technique in robotics has attracted growing attention in the community. Yet collaborative SLAM over multiple robots remains challenging due to the contradiction between the intensive graphics computation of SLAM and the limited computing capability of robots. While traditional solutions resort to powerful cloud servers acting as an external computation provider, we show by real-world measurements that the significant communication overhead of data offloading prevents their practicability for real deployment. To tackle these challenges, this paper brings the emerging edge computing paradigm into multi-robot SLAM and proposes RecSLAM, a multi-robot laser SLAM system that focuses on accelerating the map construction process under the robot-edge-cloud architecture. In contrast to conventional multi-robot SLAM, which generates graphic maps on robots and completely merges them on the cloud, RecSLAM develops a hierarchical map fusion technique that directs robots' raw data to edge servers for real-time fusion and then sends the results to the cloud for global merging. To optimize the overall pipeline, an efficient multi-robot SLAM collaborative processing framework is introduced to adaptively optimize robot-to-edge offloading tailored to heterogeneous edge resource conditions, while ensuring workload balancing among the edge servers. Extensive evaluations show RecSLAM can achieve up to 39% processing latency reduction over the state-of-the-art. Besides, a proof-of-concept prototype is developed and deployed in real scenes to demonstrate its effectiveness.
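To make the offloading decision concrete, here is a hypothetical greedy balancing sketch in the spirit of RecSLAM's robot-to-edge assignment (the cost model, names, and the longest-processing-time heuristic are assumptions, not the paper's algorithm):

```python
# Greedy robot-to-edge assignment over heterogeneous servers (illustrative).
def assign_robots(robot_loads, server_speeds):
    """Place each robot's scan-fusion workload on the edge server that
    would finish it earliest, processing heavier workloads first."""
    finish = [0.0] * len(server_speeds)   # accumulated busy time per server
    assignment = {}
    for r in sorted(range(len(robot_loads)), key=lambda r: -robot_loads[r]):
        # Completion time if robot r's data goes to server k.
        k = min(range(len(server_speeds)),
                key=lambda k: finish[k] + robot_loads[r] / server_speeds[k])
        finish[k] += robot_loads[r] / server_speeds[k]
        assignment[r] = k
    return assignment, max(finish)  # makespan proxies the fusion latency

# Example: three robots, two edge servers with different speeds.
print(assign_robots([6.0, 3.0, 2.0], [2.0, 1.0]))  # -> ({0: 0, 1: 1, 2: 0}, 4.0)
```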
To meet the ever-increasing mobile traffic demand in the 5G era, base stations (BSs) have been densely deployed in radio access networks (RANs) to increase network coverage and capacity. However, as the high density of BSs is designed to accommodate peak traffic, it consumes an unnecessarily large amount of energy if all BSs stay active during off-peak time. An effective way to save the energy consumption of cellular networks is to deactivate idle base stations that do not serve any traffic demand. In this paper, we develop a traffic-aware dynamic BS sleep control framework, named DeepBSC, which presents a novel data-driven learning approach to determine BS active/sleep modes while achieving lower energy consumption and satisfying Quality of Service (QoS) requirements. Specifically, traffic demands are predicted by the proposed GS-STN model, which leverages the geographical and semantic spatial-temporal correlations of mobile traffic. With accurate mobile traffic forecasting, the BS sleep control problem is cast as a Markov Decision Process that is solved by Actor-Critic reinforcement learning. To reduce the variance of cost estimation in the dynamic environment, we propose a benchmark transformation method that provides a robust performance indicator for policy updates. To expedite the training process, we adopt a Deep Deterministic Policy Gradient (DDPG) approach, together with an explorer network that further strengthens exploration. Extensive experiments with a real-world dataset corroborate that our proposed framework significantly outperforms existing methods.
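The following is a minimal, generic DDPG update sketch (a continuous-action stand-in; the paper's GS-STN predictor, benchmark transformation, and explorer network are not reproduced, and the state/action dimensions are assumed):

```python
# One DDPG gradient step on a replay batch (illustrative sketch).
import copy
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 16, 4   # assumed sizes, for illustration only
GAMMA, TAU = 0.99, 0.005        # discount factor and soft-update rate

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_tgt, critic_tgt = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s_next, done):
    """Update actor and critic from a batch (s, a, r, s_next, done)."""
    with torch.no_grad():   # bootstrapped TD target from target networks
        q_next = critic_tgt(torch.cat([s_next, actor_tgt(s_next)], dim=1))
        y = r + GAMMA * (1.0 - done) * q_next
    # Critic: regress Q(s, a) toward the TD target y.
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), y)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # Actor: deterministic policy gradient, ascend Q(s, actor(s)).
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
    # Polyak-average the target networks toward the online networks.
    for tgt, src in ((actor_tgt, actor), (critic_tgt, critic)):
        for p_t, p in zip(tgt.parameters(), src.parameters()):
            p_t.data.mul_(1 - TAU).add_(TAU * p.data)
```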
In-home health monitoring has attracted great attention amid the worldwide ageing of the population. With abundant user health data accessed by Internet of Things (IoT) devices and recent developments in machine learning, smart healthcare has seen many success stories. However, existing approaches for in-home health monitoring do not pay sufficient attention to user data privacy and are thus far from ready for large-scale practical deployment. In this paper, we propose FedHome, a novel cloud-edge federated learning framework for in-home health monitoring, which learns a shared global model in the cloud from multiple homes at the network edge and protects data privacy by keeping user data local. To cope with the imbalanced and non-IID distribution inherent in users' monitoring data, we design a generative convolutional autoencoder (GCAE), which aims to achieve accurate and personalized health monitoring by refining the model with a generated class-balanced dataset from the user's personal data. Besides, GCAE is lightweight to transfer between the cloud and edge, which helps reduce the communication cost of federated learning in FedHome. Extensive experiments based on realistic human activity recognition data traces corroborate that FedHome significantly outperforms existing widely-adopted methods.
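For context, the cloud-side aggregation in such a framework is typically a FedAvg-style weighted average; the sketch below is a generic illustration (FedHome's GCAE architecture and class-balanced data generation are not reproduced):

```python
# FedAvg-style aggregation of client models (illustrative sketch).
import torch

def federated_average(client_states, client_sizes):
    """Average client state_dicts, weighting each by its local data size."""
    total = float(sum(client_sizes))
    return {key: sum((n / total) * state[key].float()
                     for state, n in zip(client_states, client_sizes))
            for key in client_states[0]}

# Usage: global_state = federated_average([m.state_dict() for m in clients],
#                                         [len(d) for d in client_datasets])
```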
Recent advances in artificial intelligence have driven a growing number of intelligent applications to the network edge, such as the smart home, smart factory, and smart city. To deploy computationally intensive Deep Neural Networks (DNNs) on resource-constrained edge devices, traditional approaches rely on either offloading workload to the remote cloud or optimizing computation locally at the end device. However, cloud-assisted approaches suffer from the unreliable, high-delay wide-area network, and local computing approaches are limited by constrained computing capability. Towards high-performance edge intelligence, the cooperative execution mechanism offers a new paradigm that has attracted growing research interest. In this paper, we propose CoEdge, a distributed DNN computing system that orchestrates cooperative DNN inference over heterogeneous edge devices. CoEdge utilizes available computation and communication resources at the edge and dynamically partitions the DNN inference workload adaptively to devices' computing capabilities and network conditions. Experimental evaluations based on a realistic prototype show that CoEdge outperforms status-quo approaches in saving energy with comparable inference latency, achieving up to a 25.5%~66.9% energy reduction for four widely-adopted CNN models.
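As a rough illustration of capability-proportional partitioning (the function name and cost model are hypothetical; CoEdge's actual partitioner also accounts for network conditions):

```python
# Split an input feature map across devices in proportion to compute speed.
def partition_workload(total_rows, device_speeds):
    """Return per-device row counts proportional to measured speeds."""
    total_speed = sum(device_speeds)
    shares = [int(total_rows * s / total_speed) for s in device_speeds]
    shares[-1] += total_rows - sum(shares)   # hand any remainder to one device
    return shares

print(partition_workload(224, [4.0, 2.0, 1.0]))  # -> [128, 64, 32]
```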