Future Internet involves several emerging technologies such as 5G and beyond 5G networks, vehicular networks, unmanned aerial vehicle (UAV) networks, and Internet of Things (IoTs). Moreover, future Internet becomes heterogeneous and decentralized with a large number of involved network entities. Each entity may need to make its local decision to improve the network performance under dynamic and uncertain network environments. Standard learning algorithms such as single-agent Reinforcement Learning (RL) or Deep Reinforcement Learning (DRL) have been recently used to enable each network entity as an agent to learn an optimal decision-making policy adaptively through interacting with the unknown environments. However, such an algorithm fails to model the cooperations or competitions among network entities, and simply treats other entities as a part of the environment that may result in the non-stationarity issue. Multi-agent Reinforcement Learning (MARL) allows each network entity to learn its optimal policy by observing not only the environments, but also other entities' policies. As a result, MARL can significantly improve the learning efficiency of the network entities, and it has been recently used to solve various issues in the emerging networks. In this paper, we thus review the applications of MARL in the emerging networks. In particular, we provide a tutorial of MARL and a comprehensive survey of applications of MARL in next generation Internet. In particular, we first introduce single-agent RL and MARL. Then, we review a number of applications of MARL to solve emerging issues in future Internet. The issues consist of network access, transmit power control, computation offloading, content caching, packet routing, trajectory design for UAV-aided networks, and network security issues.
Conditional density estimation is the task of estimating the probability of an event, conditioned on some inputs. A neural network can be used to compute the output distribution explicitly. For such a task, there are many ways to represent a continuous-domain distribution using the output of a neural network, but each comes with its own limitations for what distributions it can accurately render. If the family of functions is too restrictive, it will not be appropriate for many datasets. In this paper, we demonstrate the benefits of modeling free-form distributions using deconvolution. It has the advantage of being flexible, but also takes advantage of the topological smoothness offered by the deconvolution layers. We compare our method to a number of other density-estimation approaches, and show that our Deconvolutional Density Network (DDN) outperforms the competing methods on many artificial and real tasks, without committing to a restrictive parametric model.
Recently, the research of wireless sensing has achieved more intelligent results, and the intelligent sensing of human location and activity can be realized by means of WiFi devices. However, most of the current human environment perception work is limited to a single person's environment, because the environment in which multiple people exist is more complicated than the environment in which a single person exists. In order to solve the problem of human behavior perception in a multi-human environment, we first proposed a solution to achieve crowd counting (inferred population) using deep learning in a closed environment with WIFI signals - DeepCout, which is the first in a multi-human environment. step. Since the use of WiFi to directly count the crowd is too complicated, we use deep learning to solve this problem, use Convolutional Neural Network(CNN) to automatically extract the relationship between the number of people and the channel, and use Long Short Term Memory(LSTM) to resolve the dependencies of number of people and Channel State Information(CSI) . To overcome the massive labelled data required by deep learning method, we add an online learning mechanism to determine whether or not someone is entering/leaving the room by activity recognition model, so as to correct the deep learning model in the fine-tune stage, which, in turn, reduces the required training data and make our method evolving over time. The system of DeepCount is performed and evaluated on the commercial WiFi devices. By massive training samples, our end-to-end learning approach can achieve an average of 86.4% prediction accuracy in an environment of up to 5 people. Meanwhile, by the amendment mechanism of the activity recognition model to judge door switch to get the variance of crowd to amend deep learning predicted results, the accuracy is up to 90%.