Data-driven paradigms are well-known and salient demands of future wireless communication. Empowered by big data and machine learning, next-generation data-driven communication systems will be intelligent with the characteristics of expressiveness, scalability, interpretability, and especially uncertainty modeling, which can confidently involve diversified latent demands and personalized services in the foreseeable future. In this paper, we review and present a promising family of nonparametric Bayesian machine learning methods, i.e., Gaussian processes (GPs), and their applications in wireless communication due to their interpretable learning ability with uncertainty. Specifically, we first envision three-level motivations of data-driven wireless communication using GPs. Then, we provide the background of the GP model in terms of covariance structure and model inference. The expressiveness of the GP model is introduced by using various interpretable kernel designs, namely, stationary, non-stationary, deep, and multi-task kernels. Furthermore, we review the distributed GP with promising scalability, which is suitable for applications in wireless networks with a large number of distributed edge devices. Finally, we provide representative solutions and promising techniques that adopting GPs in wireless communication systems.
The right to be forgotten has been legislated in many countries but the enforcement in machine learning would cause unbearable costs: companies may need to delete whole models learned from massive resources due to single individual requests. Existing works propose to remove the knowledge learned from the requested data via its influence function which is no longer naturally well-defined in Bayesian inference. This paper proposes a {\it Bayesian inference forgetting} (BIF) framework to realize the right to be forgotten in Bayesian inference. In the BIF framework, we develop forgetting algorithms for variational inference and Markov chain Monte Carlo. We show that our algorithms can provably remove the influence of single datums on the learned models. Theoretical analysis demonstrates that our algorithms have guaranteed generalizability. Experiments of Gaussian mixture models on the synthetic dataset and Bayesian neural networks on the real-world data verify the feasibility of our methods. The source code package is available at \url{https://github.com/fshp971/BIF}.
Graph Convolutional Networks (GCNs) and their variants have received significant attention and achieved start-of-the-art performances on various recommendation tasks. However, many existing GCN models tend to perform recursive aggregations among all related nodes, which arises severe computational burden. Moreover, they favor multi-layer architectures in conjunction with complicated modeling techniques. Though effective, the excessive amount of model parameters largely hinder their applications in real-world recommender systems. To this end, in this paper, we propose the single-layer GCN model which is able to achieve superior performance along with remarkably less complexity compared with existing models. Our main contribution is three-fold. First, we propose a principled similarity metric named distribution-aware similarity (DA similarity), which can guide the neighbor sampling process and evaluate the quality of the input graph explicitly. We also prove that DA similarity has a positive correlation with the final performance, through both theoretical analysis and empirical simulations. Second, we propose a simplified GCN architecture which employs a single GCN layer to aggregate information from the neighbors filtered by DA similarity and then generates the node representations. Moreover, the aggregation step is a parameter-free operation, such that it can be done in a pre-processing manner to further reduce red the training and inference costs. Third, we conduct extensive experiments on four datasets. The results verify that the proposed model outperforms existing GCN models considerably and yields up to a few orders of magnitude speedup in training, in terms of the recommendation performance.
Existing image-based activity understanding methods mainly adopt direct mapping, i.e. from image to activity concepts, which may encounter performance bottleneck since the huge gap. In light of this, we propose a new path: infer human part states first and then reason out the activities based on part-level semantics. Human Body Part States (PaSta) are fine-grained action semantic tokens, e.g. <hand, hold, something>, which can compose the activities and help us step toward human activity knowledge engine. To fully utilize the power of PaSta, we build a large-scale knowledge base PaStaNet, which contains 7M+ PaSta annotations. And two corresponding models are proposed: first, we design a model named Activity2Vec to extract PaSta features, which aim to be general representations for various activities. Second, we use a PaSta-based Reasoning method to infer activities. Promoted by PaStaNet, our method achieves significant improvements, e.g. 6.4 and 13.9 mAP on full and one-shot sets of HICO in supervised learning, and 3.2 and 4.2 mAP on V-COCO and images-based AVA in transfer learning. Code and data are available at http://hake-mvig.cn/.
Attributes and objects can compose diverse compositions. To model the compositional nature of these general concepts, it is a good choice to learn them through transformations, such as coupling and decoupling. However, complex transformations need to satisfy specific principles to guarantee the rationality. In this paper, we first propose a previously ignored principle of attribute-object transformation: Symmetry. For example, coupling peeled-apple with attribute peeled should result in peeled-apple, and decoupling peeled from apple should still output apple. Incorporating the symmetry principle, a transformation framework inspired by group theory is built, i.e. SymNet. SymNet consists of two modules, Coupling Network and Decoupling Network. With the group axioms and symmetry property as objectives, we adopt Deep Neural Networks to implement SymNet and train it in an end-to-end paradigm. Moreover, we propose a Relative Moving Distance (RMD) based recognition method to utilize the attribute change instead of the attribute pattern itself to classify attributes. Our symmetry learning can be utilized for the Compositional Zero-Shot Learning task and outperforms the state-of-the-art on widely-used benchmarks. Code is available at https://github.com/DirtyHarryLYL/SymNet.
In this paper, we propose a new localization framework in which mobile users or smart agents can cooperate to build accurate location services without sacrificing privacy, in particular, information related to their trajectories. The proposed framework is called Federated Localization (FedLoc), simply because it adopts the recently proposed federated learning. Apart from the new FedLoc framework, this paper can be deemed as an overview paper, in which we review the state-of-the-art federated learning framework, two widely used learning models, various distributed model hyper-parameter optimization schemes, and some practical use cases that fall under the FedLoc framework. The use cases, summarized from a mixture of standard, recently published, and unpublished works, cover a broad range of location services, including collaborative static localization/fingerprinting, indoor target tracking, outdoor navigation using low-sampling GPS, and spatio-temporal wireless traffic data modeling and prediction. The obtained primary results confirm that the proposed FedLoc framework well suits data-driven, machine learning-based localization and spatio-temporal data modeling. Future research directions are discussed at the end of this paper.
The marriage of wireless big data and machine learning techniques revolutionizes the wireless system by the data-driven philosophy. However, the ever exploding data volume and model complexity will limit centralized solutions to learn and respond within a reasonable time. Therefore, scalability becomes a critical issue to be solved. In this article, we aim to provide a systematic discussion on the building blocks of scalable data-driven wireless networks. On one hand, we discuss the forward-looking architecture and computing framework of scalable data-driven systems from a global perspective. On the other hand, we discuss the learning algorithms and model training strategies performed at each individual node from a local perspective. We also highlight several promising research directions in the context of scalable data-driven wireless communications to inspire future research.
The recent success of single-agent reinforcement learning (RL) encourages the exploration of multi-agent reinforcement learning (MARL), which is more challenging due to the interactions among different agents. In this paper, we consider a voting-based MARL problem, in which the agents vote to make group decisions and the goal is to maximize the globally averaged returns. To this end, we formulate the MARL problem based on the linear programming form of the policy optimization problem and propose a distributed primal-dual algorithm to obtain the optimal solution. We also propose a voting mechanism through which the distributed learning achieves the same sub-linear convergence rate as centralized learning. In other words, the distributed decision making does not slow down the global consensus to optimal. We also verify the convergence of our proposed algorithm with numerical simulations and conduct case studies in practical multi-agent systems.