



Abstract:Compressed Stochastic Gradient Descent (SGD) algorithms have recently been proposed to address the communication bottleneck in distributed and decentralized optimization problems, such as those that arise in federated machine learning. Existing compressed SGD algorithms assume the use of non-adaptive step-sizes (constant or diminishing) to provide theoretical convergence guarantees. Typically, the step-sizes are fine-tuned in practice to the dataset and the learning algorithm to provide good empirical performance. Such fine-tuning might be impractical in many learning scenarios, and it is therefore of interest to study compressed SGD using adaptive step-sizes. Motivated by prior work on adaptive step-size methods for SGD to train neural networks efficiently in the uncompressed setting, we develop an adaptive step-size method for compressed SGD. In particular, we introduce a scaling technique for the descent step in compressed SGD, which we use to establish order-optimal convergence rates for convex-smooth and strongly convex-smooth objectives under an interpolation condition and for non-convex objectives under a strong growth condition. We also show through simulation examples that without this scaling, the algorithm can fail to converge. We present experimental results on deep neural networks for real-world datasets, compare the performance of our proposed algorithm with previously proposed compressed SGD methods in the literature, and demonstrate improved performance on ResNet-18, ResNet-34 and DenseNet architectures for the CIFAR-100 and CIFAR-10 datasets at various levels of compression.




Abstract:We study the problem of Out-of-Distribution (OOD) detection, that is, detecting whether a learning algorithm's output can be trusted at inference time. While a number of tests for OOD detection have been proposed in prior work, a formal framework for studying this problem is lacking. We propose a definition for the notion of OOD that includes both the input distribution and the learning algorithm, which provides insights for the construction of powerful tests for OOD detection. We propose a procedure, inspired by multiple hypothesis testing, to systematically combine any number of different statistics from the learning algorithm using conformal p-values. We further provide strong guarantees on the probability of incorrectly classifying an in-distribution sample as OOD. In our experiments, we find that threshold-based tests proposed in prior work perform well in specific settings, but not uniformly well across different types of OOD instances. In contrast, our proposed method that combines multiple statistics performs uniformly well across different datasets and neural networks.
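As an illustration of the conformal p-values mentioned above, a minimal sketch follows. It assumes each statistic is a nonconformity score (larger means more anomalous) computed on a held-out in-distribution calibration set. The combination rule shown is a plain Bonferroni correction, used here only as a stand-in; the abstract does not specify the actual procedure, and all function names are hypothetical.

```python
def conformal_p_value(calibration_scores, test_score):
    """Conformal p-value of one test score against in-distribution scores.

    Assumes larger scores are more anomalous (an assumption of this sketch).
    """
    n = len(calibration_scores)
    # Count calibration points at least as extreme as the test point.
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (1 + at_least) / (n + 1)


def flag_ood_bonferroni(p_values, alpha):
    """Flag a sample as OOD if any p-value passes the Bonferroni threshold.

    Bonferroni is a stand-in here, not the procedure from the abstract.
    """
    return min(p_values) <= alpha / len(p_values)


if __name__ == "__main__":
    calib = [0.1, 0.2, 0.3, 0.4]
    print(conformal_p_value(calib, 0.35))
    print(flag_ood_bonferroni([0.2, 0.01], alpha=0.05))
```

With `K` statistics, the Bonferroni rule keeps the probability of flagging an in-distribution sample at or below `alpha` whenever each individual p-value is valid, at the cost of some power.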




Abstract:We study a monitoring system in which the distributions of sensors' observations change from a nominal distribution to an abnormal distribution in response to an adversary's presence. The system uses a quickest change detection procedure, the Shewhart rule, to detect the adversary, which in turn uses its resources to shape the abnormal distribution so as to hide its presence. The metric of interest is the probability of missed detection within a predefined number of time-slots after the changepoint. Assuming that the adversary's resource constraints are known to the detector, we find the number of sensors required to make the worst-case probability of missed detection less than an acceptable level. The distributions of observations are assumed to be Gaussian, and the presence of the adversary affects their mean. We also provide simulation results to support our analysis.



Abstract:The problem of quickest detection of a change in the distribution of a sequence of independent observations is considered. The pre-change distribution is assumed to be known and stationary, while the post-change distributions are assumed to evolve in a pre-determined non-stationary manner with some possible parametric uncertainty. In particular, it is assumed that the cumulative KL divergence between the post-change and the pre-change distributions grows super-linearly with time after the change-point. For the case where the post-change distributions are known, a universal asymptotic lower bound on the delay is derived, as the false alarm rate goes to zero. Furthermore, a window-limited CuSum test is developed, and shown to achieve the lower bound asymptotically. For the case where the post-change distributions have parametric uncertainty, a window-limited generalized likelihood-ratio test is developed and is shown to achieve the universal lower bound asymptotically. Extensions to the case with dependent observations are discussed. The analysis is validated through numerical results on synthetic data. The use of the window-limited generalized likelihood-ratio test in monitoring pandemics is also demonstrated.
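To make the window-limited CuSum idea concrete, here is a minimal sketch under stated assumptions. The statistic at time n is taken to be the maximum, over candidate change-points k in the most recent `window` time steps, of the summed log-likelihood ratios from k to n, and an alarm is raised when it crosses `threshold`. The function names, the Gaussian model with a linearly growing post-change mean, and the parameter values are all hypothetical and are not taken from the paper.

```python
def window_limited_cusum(xs, llr, window, threshold):
    """Return the first time n (1-indexed) at which the statistic crosses
    `threshold`, or None if no alarm is raised.

    `llr(x, j)` is the log-likelihood ratio of observation x taken j steps
    after a hypothesized change-point (j = 0 at the change-point itself).
    """
    for n in range(1, len(xs) + 1):
        first_k = max(1, n - window + 1)  # only recent candidate change-points
        stat = max(
            sum(llr(xs[i - 1], i - k) for i in range(k, n + 1))
            for k in range(first_k, n + 1)
        )
        if stat >= threshold:
            return n
    return None


def growing_mean_llr(x, j, slope=0.5):
    """LLR of N(mu_j, 1) against N(0, 1), with mu_j = slope * (j + 1).

    The growing mean is an illustrative non-stationary post-change model.
    """
    mu = slope * (j + 1)
    return mu * x - 0.5 * mu * mu


if __name__ == "__main__":
    observations = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0]
    print(window_limited_cusum(observations, growing_mean_llr, window=4, threshold=3.0))
```

Because the post-change model depends on the time elapsed since the change, the sum must be recomputed for each candidate k, which is why this sketch does not use the usual one-line CuSum recursion; the window bounds the per-step cost.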




Abstract:The problem of quickest detection of a change in the mean of a sequence of independent observations is studied. The pre-change distribution is assumed to be stationary, while the post-change distributions are allowed to be non-stationary. The case where the pre-change distribution is known is studied first, and then the extension where only the mean and variance of the pre-change distribution are known. No knowledge of the post-change distributions is assumed other than that their means are above some pre-specified threshold larger than the pre-change mean. For the case where the pre-change distribution is known, a test is derived that asymptotically minimizes the worst-case detection delay over all possible post-change distributions, as the false alarm rate goes to zero. Towards deriving this asymptotically optimal test, some new results are provided for the general problem of asymptotic minimax robust quickest change detection in non-stationary settings. Then, the limiting form of the optimal test is studied as the gap between the pre- and post-change means goes to zero, called the Mean-Change Test (MCT). It is shown that the MCT can be designed with only knowledge of the mean and variance of the pre-change distribution. The performance of the MCT is also characterized when the mean gap is moderate, under the additional assumption that the distributions of the observations have bounded support. The analysis is validated through numerical results for detecting a change in the mean of a beta distribution. The use of the MCT in monitoring pandemics is also demonstrated.




Abstract:To achieve high data rates and better connectivity in future communication networks, the deployment of different types of access points (APs) is underway. In order to limit human intervention and reduce costs, the APs are expected to be equipped with self-organizing capabilities. Moreover, due to the spectrum crunch, frequency reuse among the deployed APs is inevitable, aggravating the problem of inter-cell interference (ICI). Therefore, ICI mitigation in self-organizing networks (SONs) is commonly identified as a key radio resource management mechanism to enhance performance in future communication networks. With the aim of reducing ICI in a SON, this paper proposes a novel solution for the uncoordinated channel and power allocation problems. Based on the multi-player multi-armed bandit (MAB) framework, the proposed technique does not require any communication or coordination between the APs. The case of varying channel rewards across APs is considered. In contrast to previous work on channel allocation using the MAB framework, APs are permitted to choose multiple channels for transmission. Moreover, non-orthogonal multiple access (NOMA) is used to allow multiple APs to access each channel simultaneously. This results in an MAB model with varying channel rewards, multiple plays and non-zero reward on collision. The proposed algorithm has an expected regret on the order of $O(\log^2 T)$, which is validated by simulation results. Extensive numerical results also reveal that the proposed technique significantly outperforms the well-known upper confidence bound (UCB) algorithm, achieving more than a twofold increase in energy efficiency.
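For readers unfamiliar with the UCB baseline named above, the following is a minimal single-player sketch of the standard UCB1 selection rule (empirical mean plus a confidence bonus of sqrt(2 ln t / n)). It is not the multi-player algorithm proposed in the paper; the function name and argument layout are hypothetical.

```python
import math


def ucb1_select(counts, reward_sums, t):
    """Pick an arm using the UCB1 index.

    counts[a]      -- number of times arm a has been played
    reward_sums[a] -- total reward collected from arm a
    t              -- total number of plays so far (assumed >= 1 once
                      every arm has been played)
    """
    # Play every arm once before the index is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    indices = [
        reward_sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    # Return the arm with the largest index (first one on ties).
    return max(range(len(indices)), key=indices.__getitem__)


if __name__ == "__main__":
    print(ucb1_select([10, 1], [5.0, 0.4], t=11))
```

The bonus term shrinks as an arm is played more often, so a rarely played arm can be selected even when its empirical mean is lower, which is the exploration behavior the index is designed to produce.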


Abstract:We study the problem of quickest detection of a change in the mean of an observation sequence, under the assumption that both the pre- and post-change distributions have bounded support. We first study the case where the pre-change distribution is known, and then study the extension where only the mean and variance of the pre-change distribution are known. In both cases, no knowledge of the post-change distribution is assumed other than that it has bounded support. For the case where the pre-change distribution is known, we derive a test that asymptotically minimizes the worst-case detection delay over all post-change distributions, as the false alarm rate goes to zero. We then study the limiting form of the optimal test as the gap between the pre- and post-change means goes to zero, which we call the Mean-Change Test (MCT). We show that the MCT can be designed with only knowledge of the mean and variance of the pre-change distribution. We validate our analysis through numerical results for detecting a change in the mean of a beta distribution. We also demonstrate the use of the MCT for pandemic monitoring.

Abstract:A stochastic multi-user multi-armed bandit framework is used to develop algorithms for uncoordinated spectrum access. In contrast to prior work, it is assumed that rewards can be non-zero even under collisions, thus allowing for the number of users to be greater than the number of channels. The proposed algorithm consists of an estimation phase and an allocation phase. It is shown that if every user adopts the algorithm, the system-wide regret is order-optimal, namely $O(\log T)$ over a time horizon of duration $T$. The regret guarantees hold both when the number of users is greater than the number of channels and when it is smaller. The algorithm is extended to the dynamic case where the number of users in the system evolves over time, and is shown to lead to sub-linear regret.




Abstract:We study the robust mean estimation problem in high dimensions, where an $\alpha < 0.5$ fraction of the data points can be arbitrarily corrupted. Motivated by compressive sensing, we formulate the robust mean estimation problem as the minimization of the $\ell_0$-`norm' of the outlier indicator vector, under second moment constraints on the inlier data points. We prove that the global minimum of this objective is order-optimal for the robust mean estimation problem, and we propose a general framework for minimizing the objective. We further leverage the $\ell_1$ and $\ell_p$ $(0<p<1)$ minimization techniques in compressive sensing to provide computationally tractable solutions to the $\ell_0$ minimization problem. Both synthetic and real data experiments demonstrate that the proposed algorithms significantly outperform state-of-the-art robust mean estimation methods.
Abstract:Multi-user multi-armed bandits have emerged as a good model for uncoordinated spectrum access problems. In this paper, we consider the scenario where users cannot communicate with each other. In addition, the environment may appear differently to different users, i.e., the mean rewards as observed by different users for the same channel may be different. With this setup, we present a policy that achieves a regret of $O(\log T)$. This paper has been accepted at the Asilomar Conference on Signals, Systems, and Computers 2019.