Abstract: Collecting the most informative data from a large dataset distributed over a network is a fundamental problem in many fields, including control, signal processing, and machine learning. In this paper, we establish a connection between selecting the most informative data and finding the top-$k$ elements of a multiset. Top-$k$ selection in a network can be formulated as a distributed nonsmooth convex optimization problem known as quantile estimation. Unfortunately, the lack of smoothness in the local objective functions leads to extremely slow convergence and poor scalability with respect to the network size. To overcome this deficiency, we propose an accelerated method that employs smoothing techniques. Leveraging the piecewise linearity of the local objective functions in quantile estimation, we characterize the iteration complexity required to achieve top-$k$ selection, a challenging task due to the lack of strong convexity. Several numerical results are provided to validate the effectiveness of the algorithm and the correctness of the theory.
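
As a rough illustration of the smoothing idea (not the paper's distributed algorithm), the sketch below estimates the $(1 - k/n)$-quantile of a dataset by running gradient descent on a Huber-style smoothing of the nonsmooth pinball loss. The consensus step needed over a network is omitted, and the step size, smoothing parameter, and iteration count are illustrative assumptions.

```python
import numpy as np

def smoothed_pinball_grad(x, data, tau, mu):
    """Gradient (in x) of the mean Huber-smoothed pinball loss.

    The pinball loss rho_tau(r) has a kink at r = 0; replacing sign(r)
    with clip(r/mu, -1, 1) smooths the kink, which is what makes
    accelerated gradient rates applicable."""
    r = data - x                          # residuals
    s = np.clip(r / mu, -1.0, 1.0)        # smoothed sign(r)
    return -np.mean(tau - 0.5 + 0.5 * s)  # d/dx of mean rho_tau(data - x)

def estimate_quantile(data, tau, mu=0.05, lr=0.5, iters=2000):
    x = float(np.mean(data))              # initial guess
    for _ in range(iters):
        x -= lr * smoothed_pinball_grad(x, data, tau, mu)
    return x

data = np.random.default_rng(0).standard_normal(1000)
k = 100                                   # top-k threshold = (1 - k/n)-quantile
tau = 1.0 - k / len(data)
print(estimate_quantile(data, tau), np.quantile(data, tau))
```
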
Abstract: Many large-scale distributed multi-agent systems exchange information over low-power communication networks. In particular, agents in robotic network applications intermittently communicate state and control signals, often at low power over unlicensed spectrum, which leaves them prone to eavesdropping and denial-of-service attacks. In this paper, we argue that LoRa, a widely used low-power communication protocol, is vulnerable to denial-of-service attacks by an unauthenticated attacker that can successfully identify a target signal's bandwidth and spreading factor. Leveraging a structural pattern in the LoRa signal's instantaneous-frequency representation, we cast the problem of jointly inferring the two unknown parameters as a classification problem, which can be implemented efficiently using neural networks.
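
A minimal sketch of the classification framing, under illustrative assumptions (a 1 MHz baseband sample rate, a small set of candidate (spreading factor, bandwidth) pairs, and a generic 1-D CNN rather than the paper's architecture): a LoRa up-chirp's instantaneous frequency is a ramp whose slope and duration are determined by the two parameters, so each pair can be treated as one class.

```python
import numpy as np
import torch.nn as nn

FS = 1_000_000  # baseband sample rate in Hz (assumed for illustration)

def lora_upchirp(sf, bw, fs=FS):
    """LoRa-style baseband up-chirp: frequency sweeps from -bw/2 to +bw/2
    over one symbol of duration 2**sf / bw seconds."""
    t = np.arange(int(fs * 2**sf / bw)) / fs
    k = bw**2 / 2**sf                     # chirp rate in Hz/s
    return np.exp(2j * np.pi * (-bw / 2 * t + 0.5 * k * t**2))

def inst_freq(x, fs=FS):
    """Instantaneous frequency from the phase difference of consecutive samples."""
    return np.diff(np.unwrap(np.angle(x))) * fs / (2 * np.pi)

# Each (SF, BW) pair is one class; the slope and duration of the IF ramp
# is the structural pattern a classifier can key on.
classes = [(sf, bw) for sf in (7, 8, 9) for bw in (125e3, 250e3, 500e3)]

model = nn.Sequential(                    # generic 1-D CNN over IF traces
    nn.Conv1d(1, 16, kernel_size=9, stride=4), nn.ReLU(),
    nn.Conv1d(16, 32, kernel_size=9, stride=4), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(32, len(classes)),
)
```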


Abstract: The problem of communicating sensor measurements over shared networks is prevalent in many modern large-scale distributed systems, such as cyber-physical systems, wireless sensor networks, and the Internet of Things. Due to bandwidth constraints, the system designer must jointly design decentralized medium-access transmission and estimation policies that accommodate a very large number of devices in extremely contested environments, such that the collection of all observations is reproduced at the destination with the best possible fidelity. We formulate a remote estimation problem in the mean-field regime, where a very large number of sensors communicate their observations to an access point, or base station, under a strict constraint on the maximum fraction of transmitting devices. We show that in the mean-field regime this problem exhibits a structure that enables tractable optimization algorithms. More importantly, we obtain a data-driven learning scheme that admits a finite sample-complexity guarantee on the performance of the resulting estimation system under minimal assumptions on the data's probability density function.
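
The flavor of such a scheme can be conveyed with a small sketch (an illustrative assumption, not the paper's policy class or learning method): each sensor transmits only when its observation magnitude exceeds a threshold fitted from data as an empirical quantile, so that on average at most a fraction alpha of devices transmit, and the receiver falls back to the symmetric prior mean whenever a sensor stays silent.

```python
import numpy as np

def fit_threshold(samples, alpha):
    """Data-driven threshold: transmit iff |x| > tau, with tau the empirical
    (1 - alpha)-quantile of |x|, so on average a fraction alpha transmits."""
    return np.quantile(np.abs(samples), 1.0 - alpha)

def simulate(n_sensors, tau, rng):
    x = rng.standard_normal(n_sensors)    # sensor observations (assumed i.i.d.)
    sent = np.abs(x) > tau                # which devices access the medium
    xhat = np.where(sent, x, 0.0)         # silence => estimate the symmetric mean
    return sent.mean(), np.mean((x - xhat) ** 2)

rng = np.random.default_rng(0)
tau = fit_threshold(rng.standard_normal(100_000), alpha=0.05)  # fit from data
frac, mse = simulate(1_000_000, tau, rng)
print(f"fraction transmitting: {frac:.4f}, MSE: {mse:.4f}")
```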


Abstract: Sensor scheduling is a well-studied problem in signal processing and control with numerous applications. Despite this successful history, most of the related literature assumes knowledge of the underlying probabilistic model of the sensor measurements, such as the correlation structure or the entire joint probability density function. Herein, a framework for sensor scheduling for remote estimation is introduced in which the system design and the scheduling decisions are based solely on observed data. Unicast and broadcast networks and their corresponding receivers are considered. In both cases, the empirical risk minimization problem can be posed as a difference-of-convex optimization problem, and locally optimal solutions are obtained efficiently by applying the convex-concave procedure. Our results are independent of the data's probability density function, its correlation structure, and the number of sensors.
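
To illustrate the convex-concave procedure on a difference-of-convex program (a toy instance, not the paper's scheduling objective), the sketch below repeatedly linearizes the concave part around the current iterate and solves the resulting convex surrogate; the data, regularization weights, and iteration count are all illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

# Toy difference-of-convex program: minimize f(x) - g(x), with
# f(x) = ||Ax - b||^2 + lam*||x||_1 and g(x) = mu*||x||^2, both convex.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 10)), rng.standard_normal(20)
lam, mu = 0.1, 0.5

x_k = np.zeros(10)
for _ in range(20):
    # Convex-concave procedure: replace -g by its linearization at x_k,
    # -g(x) <= -g(x_k) - grad_g(x_k)^T (x - x_k), and minimize the surrogate.
    grad_g = 2.0 * mu * x_k
    x = cp.Variable(10)
    surrogate = cp.sum_squares(A @ x - b) + lam * cp.norm1(x) - grad_g @ x
    cp.Problem(cp.Minimize(surrogate)).solve()
    x_k = x.value                          # each step descends f - g
print(x_k)
```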