This paper deals with the problem of localization in a cellular network in a dense urban scenario. Global Navigation Satellite Systems (GNSS) typically perform poorly in urban environments, where the likelihood of line-of-sight conditions is low, and thus alternative localization methods are required for good accuracy. We present LocUNet, a deep learning method for localization based solely on Received Signal Strength (RSS) from Base Stations (BSs). Unlike methods that rely on time of arrival or angle of arrival information, it does not require any increase in computational complexity at the user devices beyond their standard operations. In the proposed method, the user to be localized reports the RSS from BSs to a Central Processing Unit (CPU), which may be located in the cloud. Alternatively, the localization can be performed locally at the user. Using estimated pathloss radio maps of the BSs, LocUNet can localize users with state-of-the-art accuracy and enjoys high robustness to inaccuracies in the radio maps. The proposed method does not require pre-sampling of the environment and is suitable for real-time applications, thanks to RadioUNet, a neural network-based radio map estimator. We also introduce two datasets that allow numerical comparisons of RSS and Time of Arrival (ToA) methods in realistic urban environments.
The Neural Tangent Kernel (NTK) is widely used to analyze overparametrized neural networks due to the famous result of Jacot et al. (2018): in the infinite-width limit, the NTK is deterministic and constant during training. However, this result cannot explain the behavior of deep networks, since it generally does not hold if depth and width tend to infinity simultaneously. In this paper, we study the NTK of fully-connected ReLU networks with depth comparable to width. We prove that the NTK properties depend significantly on the depth-to-width ratio and the distribution of parameters at initialization. In fact, our results indicate the importance of the three phases in the hyperparameter space identified by Poole et al. (2016): ordered, chaotic and the edge of chaos (EOC). We derive exact expressions for the NTK dispersion in the infinite-depth-and-width limit in all three phases and conclude that the NTK variability grows exponentially with depth at the EOC and in the chaotic phase but not in the ordered phase. We also show that the NTK of deep networks may stay constant during training only in the ordered phase and discuss how the structure of the NTK matrix changes during training.
We present CartoonX (Cartoon Explanation), a novel model-agnostic explanation method tailored towards image classifiers and based on the rate-distortion explanation (RDE) framework. Natural images are roughly piece-wise smooth signals (also called cartoon images) and tend to be sparse in the wavelet domain. CartoonX is the first explanation method to exploit this by requiring its explanations to be sparse in the wavelet domain, thus extracting the relevant piece-wise smooth part of an image instead of relevant pixel-sparse regions. We demonstrate experimentally that CartoonX is not only highly interpretable due to its piece-wise smooth nature but also particularly apt at explaining misclassifications.
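The claim that piece-wise smooth signals are sparse in the wavelet domain can be illustrated with a minimal sketch. It assumes a 1-D piece-wise constant signal and a plain averaging Haar transform; the function name `haar_transform` and the toy signal are illustrative choices, not part of CartoonX, which operates on images and is not tied to this particular wavelet.

```python
def haar_transform(signal):
    """Multi-level Haar decomposition of a list whose length is a power of two.

    Returns [coarsest average] followed by detail coefficients,
    ordered from the coarsest level to the finest.
    """
    details = []
    approx = list(signal)
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        # Detail = half the difference, approximation = average of each pair.
        level_detail = [(a - b) / 2 for a, b in pairs]
        approx = [(a + b) / 2 for a, b in pairs]
        details = level_detail + details  # coarser levels go in front
    return approx + details


# Toy "cartoon" signal: two flat pieces with a single jump at index 5.
signal = [2.0] * 5 + [5.0] * 11
coeffs = haar_transform(signal)

nonzero_pixels = sum(1 for s in signal if s != 0.0)
nonzero_coeffs = sum(1 for c in coeffs if c != 0.0)
print("nonzero samples:", nonzero_pixels)
print("nonzero Haar coefficients:", nonzero_coeffs)
```

Only the coefficients whose support straddles the jump are nonzero, so the signal needs far fewer wavelet coefficients than samples. This is the kind of sparsity a wavelet-domain explanation can exploit.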
We present the Rate-Distortion Explanation (RDE) framework, a mathematically well-founded method for explaining black-box model decisions. The framework is based on perturbations of the target input signal and applies to any differentiable pre-trained model such as neural networks. Our experiments demonstrate the framework's adaptability to diverse data modalities, particularly images, audio, and physical simulations of urban environments.
We study spectral graph convolutional neural networks (GCNNs), where filters are defined as continuous functions of the graph shift operator (GSO) through functional calculus. A spectral GCNN is not tailored to one specific graph and can be transferred between different graphs. It is hence important to study the GCNN transferability: the capacity of the network to have approximately the same effect on different graphs that represent the same phenomenon. Transferability ensures that GCNNs trained on certain graphs generalize if the graphs in the test set represent the same phenomena as the graphs in the training set. In this paper, we consider a model of transferability based on graphon analysis. Graphons are limit objects of graphs, and, in the graph paradigm, two graphs represent the same phenomenon if both approximate the same graphon. Our main contributions can be summarized as follows: 1) we prove that any fixed GCNN with continuous filters is transferable under graphs that approximate the same graphon, 2) we prove transferability for graphs that approximate unbounded graphon shift operators, which are defined in this paper, and 3) we obtain non-asymptotic approximation results, proving linear stability of GCNNs. This extends current state-of-the-art results, which show asymptotic transferability for polynomial filters under graphs that approximate bounded graphons.
We present a deep learning-based algorithm to jointly solve a reconstruction problem and a wavefront set extraction problem in tomographic imaging. The algorithm is based on a recently developed digital wavefront set extractor as well as the well-known microlocal canonical relation for the Radon transform. We use the wavefront set information about x-ray data to improve the reconstruction by requiring that the underlying neural networks simultaneously extract the correct ground truth wavefront set and ground truth image. As a necessary theoretical step, we identify the digital microlocal canonical relations for deep convolutional residual neural networks. We find strong numerical evidence for the effectiveness of this approach.
This paper deals with the problem of localization in a cellular network in a dense urban scenario. Global Navigation Satellite Systems typically perform poorly in urban environments, where the likelihood of line-of-sight conditions between the devices and the satellites is low, and thus alternative localization methods are required for good accuracy. We present a deep learning method for localization based solely on pathloss. Unlike methods that rely on time of arrival or angle of arrival information, it does not require any increase in computational complexity at the user devices beyond their standard operations. In a wireless network, user devices scan the base station beacon slots and identify the few strongest base station signals for handover and user-base station association purposes. In the proposed method, the user to be localized simply reports these received signal strengths to a central processing unit, which may be located in the cloud. For each base station, we have a good approximation of the pathloss at every location in a dense grid in the map. This approximation is provided by RadioUNet, a deep learning-based simulator of pathloss functions in urban environments that we have previously proposed and published. Using the estimated pathloss radio maps of all base stations and the corresponding reported signal strengths, the proposed deep learning algorithm can accurately localize the user. The proposed method, called LocUNet, enjoys high robustness to inaccuracies in the estimated radio maps. We demonstrate this by numerical experiments, which obtain state-of-the-art results.
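To make the inputs of this setting concrete, the following minimal sketch localizes a user from per-base-station radio maps and a vector of reported signal strengths. It is a naive least-squares grid search, not LocUNet, which replaces this matching step with a learned network. The log-distance model in `toy_radio_map`, the grid size, and all names are assumptions made for illustration; in the paper the maps come from RadioUNet.

```python
import math


def toy_radio_map(bs, height, width, exponent=3.0):
    """Hypothetical log-distance gain map (dB) for a base station at (row, col)."""
    grid = []
    for r in range(height):
        row = []
        for c in range(width):
            d = math.hypot(r - bs[0], c - bs[1]) + 1.0  # +1 avoids log10(0)
            row.append(-10.0 * exponent * math.log10(d))
        grid.append(row)
    return grid


def localize(radio_maps, reported):
    """Return the grid cell whose map values best match the reported values."""
    height, width = len(radio_maps[0]), len(radio_maps[0][0])
    best, best_err = None, float("inf")
    for r in range(height):
        for c in range(width):
            # Squared mismatch between map predictions and reported strengths.
            err = sum((m[r][c] - s) ** 2 for m, s in zip(radio_maps, reported))
            if err < best_err:
                best, best_err = (r, c), err
    return best


base_stations = [(0, 0), (0, 31), (31, 0), (31, 31)]
maps = [toy_radio_map(bs, 32, 32) for bs in base_stations]

true_pos = (10, 20)
reported = [m[true_pos[0]][true_pos[1]] for m in maps]  # noise-free report
estimate = localize(maps, reported)
print("estimated position:", estimate)
```

With noise-free reports and exact maps, the search recovers the true cell. Such direct matching degrades once the maps or the reports are inaccurate, which is the regime where the abstract claims robustness for the learned method.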
We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the problem, understanding what features are learned, why deep architectures perform exceptionally well in physical problems, and which fine aspects of an architecture affect the behavior of a learning task in which way. We present an overview of modern approaches that yield partial answers to these questions. For selected approaches, we describe the main ideas in more detail.
Neural Tangent Kernel (NTK) theory is widely used to study the dynamics of infinitely wide deep neural networks (DNNs) under gradient descent. But do the results for infinitely wide networks give us hints about the behaviour of real finite-width ones? In this paper, we study empirically when NTK theory is valid in practice for fully-connected ReLU and sigmoid networks. We find that whether a network is in the NTK regime depends on the hyperparameters of the random initialization and the network's depth. In particular, NTK theory does not explain the behaviour of sufficiently deep networks initialized so that their gradients explode: the kernel is random at initialization and changes significantly during training, contrary to NTK theory. On the other hand, in the case of vanishing gradients, DNNs are in the NTK regime but rapidly become untrainable as depth increases. We also describe a framework to study generalization properties of DNNs by means of NTK theory and discuss its limits.
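As a rough illustration of what an empirical NTK at initialization looks like, the sketch below evaluates it in closed form for a one-hidden-layer ReLU network f(x) = (1/sqrt(m)) * sum_i v_i * relu(w_i . x) with standard Gaussian weights. This shallow toy setting, the function names, and the chosen widths are assumptions for illustration only; the paper studies deep networks, where depth and initialization hyperparameters matter.

```python
import random


def init_params(width, dim, seed):
    """Draw standard Gaussian hidden weights W and output weights v."""
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(width)]
    v = [rng.gauss(0.0, 1.0) for _ in range(width)]
    return W, v


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def empirical_ntk(x1, x2, W, v):
    """Inner product of parameter gradients of f at x1 and x2."""
    m = len(W)
    xx = dot(x1, x2)
    total = 0.0
    for w_i, v_i in zip(W, v):
        a1, a2 = dot(w_i, x1), dot(w_i, x2)
        # Contribution of the gradient with respect to v_i.
        total += max(a1, 0.0) * max(a2, 0.0)
        # Contribution of the gradient with respect to w_i (both units active).
        if a1 > 0 and a2 > 0:
            total += v_i * v_i * xx
    return total / m


x1, x2 = [1.0, 0.0], [0.6, 0.8]
for width in (16, 1024):
    values = [empirical_ntk(x1, x2, *init_params(width, 2, seed))
              for seed in range(20)]
    print(width, "spread over seeds:", round(max(values) - min(values), 3))
```

Comparing the spread over seeds for the two widths gives a hands-on way to see how strongly the kernel at initialization depends on the random draw in this shallow setting.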