Privacy in speech and audio has many facets. A particularly under-developed area of privacy in this domain involves consideration for information related to content and context. Speech content can include words and their meaning or even stylistic markers, pathological speech, intonation patterns, or emotion. More generally, audio captured in-the-wild may contain background speech or reveal contextual information such as markers of location, room characteristics, paralinguistic sounds, or other audible events. Audio recording devices and speech technologies are becoming increasingly commonplace in everyday life. At the same time, commercialised speech and audio technologies do not provide consumers with a range of privacy choices. Even where privacy is regulated or protected by law, technical solutions to privacy assurance and enforcement fall short. This position paper introduces three important and timely research challenges for content privacy in speech and audio. We highlight current gaps and opportunities and identify focus areas that could have significant implications for developing ethical and safer speech technologies.
We present a novel method for the safety verification of nonlinear dynamical models that uses neural networks to represent abstractions of their dynamics. Neural networks have been used extensively before as approximators; in this work, we go a step further and use them for the first time as abstractions. For a given dynamical model, our method synthesises a neural network that overapproximates its dynamics by ensuring an arbitrarily tight, formally certified bound on the approximation error. For this purpose, we employ a counterexample-guided inductive synthesis procedure. We show that this produces a neural ODE with non-deterministic disturbances that constitutes a formal abstraction of the concrete model under analysis. This guarantees a fundamental property: if the abstract model is safe, i.e., free from any initialised trajectory that reaches an undesirable state, then the concrete model is also safe. By using neural ODEs with ReLU activation functions as abstractions, we cast the safety verification problem for nonlinear dynamical models into that of hybrid automata with affine dynamics, which we verify using SpaceEx. We demonstrate that our approach performs comparably to the mature tool Flow* on existing benchmark nonlinear models. We additionally demonstrate that it is effective on models that do not exhibit local Lipschitz continuity, which are out of reach of existing technologies.
Constrained Markov decision processes (CMDPs) model scenarios of sequential decision making with multiple objectives that are increasingly important in many applications. However, the model is often unknown and must be learned online while still ensuring the constraint is met, or at least the violation is bounded with time. Some recent papers have made progress on this very challenging problem but either need unsatisfactory assumptions such as knowledge of a safe policy, or have high cumulative regret. We propose the Safe PSRL (posterior sampling-based RL) algorithm that does not need such assumptions and yet performs very well, both in terms of theoretical regret bounds and empirically. The algorithm achieves an efficient tradeoff between exploration and exploitation by use of the posterior sampling principle, and provably suffers only bounded constraint violation by leveraging the idea of pessimism. Our method is based on a primal-dual approach. We establish a sub-linear $\tilde{\mathcal{O}}\left(H^{2.5} \sqrt{|\mathcal{S}|^2 |\mathcal{A}| K} \right)$ upper bound on the Bayesian reward objective regret along with a bounded, i.e., $\tilde{\mathcal{O}}\left(1\right)$, constraint violation regret over $K$ episodes for an $|\mathcal{S}|$-state, $|\mathcal{A}|$-action and horizon $H$ CMDP.
In autonomous robot exploration tasks, a mobile robot needs to actively explore and map an unknown environment as fast as possible. Since the environment is being revealed during exploration, the robot needs to frequently re-plan its path online, as new information is acquired by onboard sensors and used to update its partial map. While state-of-the-art exploration planners are frontier- and sampling-based, encouraged by the recent development in deep reinforcement learning (DRL), we propose ARiADNE, an attention-based neural approach to obtain real-time, non-myopic path planning for autonomous exploration. ARiADNE is able to learn dependencies at multiple spatial scales between areas of the agent's partial map, and implicitly predict potential gains associated with exploring those areas. This allows the agent to sequence movement actions that balance the natural trade-off between exploitation/refinement of the map in known areas and exploration of new areas. We experimentally demonstrate that our method outperforms both learning and non-learning state-of-the-art baselines in terms of average trajectory length to complete exploration in hundreds of simplified 2D indoor scenarios. We further validate our approach in high-fidelity Robot Operating System (ROS) simulations, where we consider a real sensor model and a realistic low-level motion controller, toward deployment on real robots.
Convolutional neural networks (CNNs) have been successful in machine learning applications. Their success relies on their ability to consider space invariant local features. We consider the use of CNNs to fit nuisance models in semiparametric estimation of the average causal effect of a treatment. In this setting, nuisance models are functions of pre-treatment covariates that need to be controlled for. In an application where we want to estimate the effect of early retirement on a health outcome, we propose to use CNNs to control for time-structured covariates. Thus, CNNs are used when fitting nuisance models explaining the treatment and the outcome. These fits are then combined into an augmented inverse probability weighting estimator yielding efficient and uniformly valid inference. Theoretically, we contribute by providing rates of convergence for CNNs equipped with the rectified linear unit activation function and comparing them to an existing result for feedforward neural networks. We also show when those rates guarantee uniformly valid inference. A Monte Carlo study is provided where the performance of the proposed estimator is evaluated and compared with other strategies. Finally, we give results on a study of the effect of early retirement on hospitalization using data covering the whole Swedish population.
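As a rough illustration of how fitted nuisance models are combined, the sketch below computes the standard augmented inverse probability weighting (AIPW) estimate of the average causal effect from already-fitted values. The function name `aipw_ate` and its argument names are hypothetical, and the nuisance fits are simply passed in as lists; in the abstract's setting they would come from CNNs, which are not reproduced here.

```python
from statistics import fmean


def aipw_ate(y, t, e_hat, m1_hat, m0_hat):
    """AIPW estimate of the average causal effect (illustrative sketch).

    y      : observed outcomes
    t      : treatment indicators (1 = treated, 0 = control)
    e_hat  : fitted propensity scores P(T = 1 | X), strictly between 0 and 1
    m1_hat : fitted outcome regression under treatment, E[Y | T = 1, X]
    m0_hat : fitted outcome regression under control,   E[Y | T = 0, X]
    """
    terms = []
    for yi, ti, ei, m1, m0 in zip(y, t, e_hat, m1_hat, m0_hat):
        # Inverse-probability-weighted residual corrections for each arm.
        treated_correction = ti * (yi - m1) / ei
        control_correction = (1 - ti) * (yi - m0) / (1 - ei)
        # Outcome-model contrast plus the two corrections.
        terms.append(m1 - m0 + treated_correction - control_correction)
    return fmean(terms)


# Toy inputs (made up for illustration only).
y = [3.0, 1.0, 4.0, 2.0]
t = [1, 0, 1, 0]
e_hat = [0.5, 0.5, 0.5, 0.5]
m1_hat = [3.0, 2.0, 4.0, 3.0]
m0_hat = [1.0, 1.0, 2.0, 2.0]
estimate = aipw_ate(y, t, e_hat, m1_hat, m0_hat)
```

When the outcome fits match the observed outcomes exactly, as in the toy inputs, the correction terms vanish and the estimate is the mean of the fitted contrasts; with outcome fits set to zero the expression reduces to plain inverse probability weighting.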
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR). In this paper, we aim to improve distillation methods that pave the way for the deployment of such models in practice. The proposed distillation approach supports both retrieval and re-ranking stages and crucially leverages the relative geometry among queries and documents learned by the large teacher model. It goes beyond existing distillation methods in the IR literature, which simply rely on the teacher's scalar scores over the training data, on two fronts: providing stronger signals about local geometry via embedding matching and attaining better coverage of data manifold globally via query generation. Embedding matching provides a stronger signal to align the representations of the teacher and student models. At the same time, query generation explores the data manifold to reduce the discrepancies between the student and teacher where training data is sparse. Our distillation approach is theoretically justified and applies to both dual encoder (DE) and cross-encoder (CE) models. Furthermore, for distilling a CE model to a DE model via embedding matching, we propose a novel dual pooling-based scorer for the CE model that facilitates a distillation-friendly embedding geometry, especially for DE student models.
In this paper we propose a new method for the automatic recognition of the state of behavioral sleep (BS) and waking state (WS) in freely moving rats using their electrocorticographic (ECoG) data. Three-channel ECoG signals were recorded from frontal left, frontal right and occipital right cortical areas. We employed a simple artificial neural network (ANN), in which the mean values and standard deviations of ECoG signals from two or three channels were used as inputs for the ANN. Results of wavelet-based recognition of BS/WS in the same data were used to train the ANN and evaluate correctness of our classifier. We tested different combinations of ECoG channels for detecting BS/WS. Our results showed that the accuracy of ANN classification did not depend on the choice of ECoG channel. For any ECoG channel, networks were trained on one rat and applied to another rat with an accuracy of at least 80~\%. It is important that we used a very simple network topology to achieve a relatively high accuracy of classification. Our classifier was based on a simple linear combination of input signals with some weights, and these weights could be replaced by the averaged weights of all trained ANNs without decreases in classification accuracy. In all, we introduce a new sleep recognition method that does not require additional network training. It is enough to know the coefficients and the equations suggested in this paper. The proposed method showed very fast performance and simple computations; therefore, it could be used in real-time experiments. It might be of high demand in preclinical studies in rodents that require vigilance control or monitoring of sleep-wake patterns.
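The feature extraction and linear decision rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's classifier: the function names, the use of the population standard deviation, the sign convention (positive score means BS), and the weights and bias are all hypothetical placeholders, since the abstract does not give the coefficients.

```python
from statistics import fmean, pstdev


def window_features(channels):
    """Return [mean_1, std_1, mean_2, std_2, ...] for one time window.

    `channels` is a list of per-channel sample lists (two or three ECoG
    channels in the abstract's setting).
    """
    features = []
    for samples in channels:
        features.append(fmean(samples))
        features.append(pstdev(samples))  # population std; an assumption
    return features


def classify(features, weights, bias):
    """Linear combination of the inputs followed by a threshold at zero."""
    score = bias + sum(w * f for w, f in zip(weights, features))
    return "BS" if score > 0 else "WS"  # sign convention is an assumption


# Toy two-channel window and placeholder coefficients (not from the paper).
window = [[1.0, 1.0, 1.0], [0.0, 2.0, 4.0]]
features = window_features(window)
label = classify(features, weights=[0.0, 0.0, 0.0, 1.0], bias=-1.0)
```

Because the rule is a fixed linear combination, applying averaged weights from previously trained networks only requires swapping the `weights` and `bias` arguments; no retraining step is involved.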
This work is a study of acoustic non-reciprocity exhibited by a passive one-dimensional linear waveguide incorporating two local strongly nonlinear, asymmetric gates. The two local nonlinear gates break the symmetry and linearity of the waveguide, yielding strong global non-reciprocal acoustics, in that extremely different acoustical responses occur depending on the side of application of harmonic excitation. To the authors' best knowledge, the present two-gated waveguide is capable of extremely high acoustic non-reciprocity, at a much higher level than what is reported for active or passive devices in the current literature; moreover, this extreme performance combines with acceptable levels of transmissibility in the desired direction of wave propagation. Machine learning is utilized for predictive design of this gated waveguide in terms of the measures of transmissibility and non-reciprocity, with the aim of reducing the required computational time for high-dimensional parameter space analysis. The study sheds new light on the physics of these media and considers the advantages and limitations of using neural networks to analyze this type of physical problem. In the predicted desirable parameter space for intense non-reciprocity, the maximum transmissibility reaches as much as 40%, and the transmitted energy from upstream to downstream varies up to nine orders of magnitude, depending on the direction of wave transmission. The machine learning tools along with the numerical methods of this work can inform predictive designs of practical non-reciprocal waveguides and acoustic metamaterials that incorporate local nonlinear gates. The current paper shows that combinations of nonlinear gates can lead to extremely high non-reciprocity while maintaining desired levels of transmissibility.
Robotic shepherding is a bio-inspired approach to autonomously guiding a swarm of agents towards a desired location and has earned increasing research interest recently. However, shepherding a highly dispersed swarm in an obstructive environment remains challenging for the existing methods. To improve the shepherding efficacy in complex environments with obstacles and dispersed sheep, this paper proposes a planning-assisted autonomous shepherding framework with collision avoidance. The proposed approach transforms the swarm shepherding problem into a single Travelling Salesman Problem (TSP), with the sheepdog moving mode classified into non-interaction and interaction modes. Additionally, an adaptive switching approach is integrated into the framework to guide real-time path planning for avoiding collisions with obstacles and sometimes with the sheep swarm. Then the overarching hierarchical mission planning system is presented, which consists of a grouping approach to obtain sheep sub-swarms, a general TSP solver for determining the optimal push sequence of sub-swarms, and an online path planner for calculating optimal paths for both sheepdogs and sheep. The experiments on a range of environments, both with and without obstacles, quantitatively demonstrate the effectiveness of the proposed shepherding framework and planning approaches.
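To make the push-sequence step concrete, the sketch below orders sub-swarm centroids by exhaustive search over visiting orders, minimising the straight-line travel distance from a starting position. It is a brute-force stand-in for the general TSP solver mentioned above, not the paper's planner: the function name `push_sequence`, the open-path cost (no return to the start), and the neglect of obstacles and goal location are all simplifying assumptions, and the search is only practical for a handful of sub-swarms.

```python
from itertools import permutations
from math import dist


def push_sequence(start, centroids):
    """Return (order, length) of the shortest open path through all centroids.

    start     : (x, y) position of the sheepdog
    centroids : list of (x, y) sub-swarm centroids
    """
    best_order, best_length = None, float("inf")
    for order in permutations(range(len(centroids))):
        length, position = 0.0, start
        for index in order:
            # Accumulate Euclidean distance between consecutive stops.
            length += dist(position, centroids[index])
            position = centroids[index]
        if length < best_length:
            best_order, best_length = order, length
    return list(best_order), best_length


# Toy layout (made up for illustration): three sub-swarms on a line.
order, length = push_sequence((0.0, 0.0), [(2.0, 0.0), (1.0, 0.0), (3.0, 0.0)])
```

For larger numbers of sub-swarms the factorial growth of this search would call for a heuristic or a dedicated solver instead.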
The large number of ReLU non-linearity operations in existing deep neural networks makes them ill-suited for latency-efficient private inference (PI). Existing techniques to reduce ReLU operations often involve manual effort and sacrifice significant accuracy. In this paper, we first present a novel measure of non-linearity layers' ReLU sensitivity, which reduces the time-consuming manual effort needed to identify it. Based on this sensitivity, we then present SENet, a three-stage training method that, for a given ReLU budget, automatically assigns per-layer ReLU counts, decides the ReLU locations for each layer's activation map, and trains a model with significantly fewer ReLUs to potentially yield latency and communication efficient PI. Experimental evaluations with multiple models on various datasets show SENet's superior performance both in terms of reduced ReLUs and improved classification accuracy compared to existing alternatives. In particular, SENet can yield models that require up to ~2x fewer ReLUs while yielding similar accuracy. For a similar ReLU budget, SENet can yield models with ~2.32% improved classification accuracy, evaluated on CIFAR-100.