We present XCM, an eXplainable Convolutional neural network for Multivariate time series classification. XCM is a new compact convolutional neural network which extracts, in parallel, information relative to the observed variables and time from the input data. Thus, the XCM architecture enables faithful explainability based on a post-hoc model-specific method (Gradient-weighted Class Activation Mapping), which identifies the observed variables and timestamps of the input data that are important for predictions. Our evaluation first shows that XCM outperforms the state-of-the-art multivariate time series classifiers on both the large and small public UEA datasets. Furthermore, following the illustration of the performance and explainability of XCM on a synthetic dataset, we present how XCM can outperform the current most accurate state-of-the-art algorithm on a real-world application while enhancing explainability by providing faithful and more informative explanations.
In this paper, we present a novel multi-channel speech extraction system to simultaneously extract multiple clean individual sources from a mixture in noisy and reverberant environments. The proposed method is built on an improved multi-channel time-domain speech separation network which employs speaker embeddings to identify and extract multiple targets without label permutation ambiguity. To efficiently convey speaker information to the extraction model, we propose a new speaker conditioning mechanism by designing an additional speaker branch for receiving external speaker embeddings. Experiments on 2-channel WHAMR! data show that the proposed system improves source separation performance by 9% relative over a strong multi-channel baseline and increases speech recognition accuracy by more than 16% relative over the same baseline.
This article proposes an extension for standard time-series regression tree modelling to handle predictors that show irregularities such as missing observations, periodic patterns in the form of seasonality and cycles, and non-stationary trends. In doing so, this approach also makes it possible to enrich the information set used in tree-based autoregressions via unobserved components. Furthermore, this manuscript illustrates a relevant approach to control over-fitting based on ensemble learning and recent developments in the jackknife literature. This is strongly beneficial when the number of observed time periods is small and advantageous compared to benchmark resampling methods. Empirical results show the benefits of predicting equity squared returns as a function of their own past and a set of macroeconomic data via factor-augmented tree ensembles, with respect to simpler benchmarks. As a by-product, this approach allows studying the real-time importance of economic news on equity volatility.
Acceleration and momentum are the de facto standard in modern applications of machine learning and optimization, yet the bulk of the work on implicit regularization focuses instead on unaccelerated methods. In this paper, we study the statistical risk of the iterates generated by Nesterov's accelerated gradient method and Polyak's heavy ball method, when applied to least squares regression, drawing several connections to explicit penalization. We carry out our analyses in continuous time, allowing us to make sharper statements than in prior work, and revealing complex interactions between early stopping, stability, and the curvature of the loss function.
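As a rough illustration of the two methods named above, the following is a minimal discrete-time sketch of Polyak's heavy ball and Nesterov's accelerated gradient on a tiny least squares problem. It is not the continuous-time analysis of the abstract; the data `X`, `y`, the step size `lr`, and the `momentum` value are assumptions chosen only so that the example runs. The `steps` argument plays the role of the early-stopping time.

```python
# Hypothetical toy data: three observations, two features, exact solution w = [1, 2].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]


def grad(X, y, w):
    """Gradient of the least squares loss (1 / 2n) * ||X w - y||^2 at w."""
    n = len(X)
    residuals = [sum(x * wi for x, wi in zip(row, w)) - yi for row, yi in zip(X, y)]
    return [sum(row[j] * r for row, r in zip(X, residuals)) / n for j in range(len(w))]


def heavy_ball(X, y, steps, lr=0.5, momentum=0.5):
    """Polyak's heavy ball: gradient step at w plus a momentum term."""
    w = [0.0] * len(X[0])
    w_prev = list(w)
    for _ in range(steps):
        g = grad(X, y, w)
        w_next = [wi - lr * gi + momentum * (wi - pi)
                  for wi, gi, pi in zip(w, g, w_prev)]
        w_prev, w = w, w_next
    return w


def nesterov(X, y, steps, lr=0.5, momentum=0.5):
    """Nesterov's method: gradient evaluated at the extrapolated point."""
    w = [0.0] * len(X[0])
    w_prev = list(w)
    for _ in range(steps):
        look = [wi + momentum * (wi - pi) for wi, pi in zip(w, w_prev)]
        g = grad(X, y, look)
        w_prev, w = w, [li - lr * gi for li, gi in zip(look, g)]
    return w
```

Calling either function with a small `steps` value returns an early-stopped iterate, while a large value approaches the least squares solution on this toy problem.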
Artificial Neural Networks (ANNs) are increasingly being used within safety-critical Cyber-Physical Systems (CPSs). They are often co-located with traditional embedded software, and may perform advisory or control-based roles. It is important to validate both the timing and functional correctness of these systems. However, most approaches in the literature consider guaranteeing only the functionality of ANN-based controllers. This issue stems largely from the implementation strategies used within common neural network frameworks -- their underlying source code is often simply unsuitable for formal techniques such as static timing analysis. As a result, developers of safety-critical CPSs must rely on informal techniques such as measurement-based approaches to prove correctness, techniques that provide weak guarantees at best. In this work we address this challenge. We propose a design pipeline whereby neural networks trained using the popular deep learning framework Keras are compiled to functionally equivalent C code. This C code is restricted to simple constructs that may be analysed by existing static timing analysis tools. As a result, if compiled to a suitable time-predictable platform, all execution bounds may be statically derived. To demonstrate the benefits of our approach, we execute an ANN trained to drive an autonomous vehicle around a race track. We compile the ANN to the Patmos time-predictable controller, and show that we can derive worst-case execution timings.
The Internet of Medical Things (IoMT) represents an application of the Internet of Things, where health professionals perform remote analysis of physiological data collected using sensors that are associated with patients, allowing real-time and permanent monitoring of the patient's health condition and the detection of possible diseases at an early stage. However, the use of wireless communication for data transfer exposes this data to cyberattacks, and the sensitive and private nature of this data may make it a prime target for attackers. The use of traditional security methods on equipment that is limited in terms of storage and computing capacity is ineffective. In this context, we have performed a comprehensive survey to investigate the use of intrusion detection systems based on machine learning (ML) for IoMT security. We present the generic three-layer architecture of IoMT and its security requirements. We review the various threats that can affect IoMT security and identify the advantages, disadvantages, methods, and datasets used in each ML-based solution. We then outline some challenges and limitations of applying ML on each layer of IoMT, which can serve as directions for future study.
We study the stochastic multi-player multi-armed bandit problem. In this problem, $m$ players cooperate to maximize their total reward from $K > m$ arms. However, the players cannot communicate and are penalized (e.g., receive no reward) if they pull the same arm at the same time. We ask whether it is possible to obtain optimal instance-dependent regret $\tilde{O}(1/\Delta)$ where $\Delta$ is the gap between the $m$-th and $m+1$-st best arms. Such guarantees were recently achieved in a model allowing the players to implicitly communicate through intentional collisions. We show that with no communication at all, such guarantees are, surprisingly, not achievable. In fact, obtaining the optimal $\tilde{O}(1/\Delta)$ regret for some regimes of $\Delta$ necessarily implies strictly sub-optimal regret in other regimes. Our main result is a complete characterization of the Pareto optimal instance-dependent trade-offs that are possible with no communication. Our algorithm generalizes that of Bubeck, Budzinski, and the second author and enjoys the same strong no-collision property, while our lower bound is based on a topological obstruction and holds even under full information.
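The collision rule and the gap $\Delta$ described above can be made concrete with a short sketch. It only models one round of the game and the gap between the $m$-th and $(m+1)$-st best arms; it is not the algorithm or the lower bound from the abstract. The function names `round_rewards` and `gap`, and the zero-reward penalty, are assumptions for illustration (the abstract gives "no reward" only as an example penalty).

```python
from collections import Counter


def round_rewards(choices, arm_rewards):
    """Per-player rewards for one round.

    choices[i] is the arm pulled by player i; arm_rewards[a] is the reward
    drawn for arm a this round. Players that collide on an arm get 0.0
    (assumed penalty).
    """
    counts = Counter(choices)
    return [arm_rewards[a] if counts[a] == 1 else 0.0 for a in choices]


def gap(means, m):
    """Gap between the m-th and (m+1)-st largest mean rewards (requires K > m)."""
    if len(means) <= m:
        raise ValueError("need more arms than players")
    ranked = sorted(means, reverse=True)
    return ranked[m - 1] - ranked[m]
```

For example, with three players choosing arms `[0, 1, 1]`, only the first player collects a reward, since the other two collide on arm 1.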
Cyber Physical Systems (CPS) applications have agents that actuate in their local vicinity, while requiring measurements that capture the state of their larger environment to make actuation choices. These measurements are made by sensors and communicated over a network as update packets. Network resource constraints dictate that updates arrive at an agent intermittently and be aged on their arrival. This can be alleviated by providing an agent with a fast enough rate of estimates of the measurements. Works on estimation often assume knowledge of the dynamic model of the system being measured. However, as CPS applications become pervasive, such information may not be available in practice. In this work, we propose a novel deep neural network architecture that leverages Long Short Term Memory (LSTM) networks to learn estimates in a model-free setting using only updates received over the network. We detail an online algorithm that enables training of our architecture. The architecture is shown to provide good estimates of measurements of both a linear and a non-linear dynamic system. It learns good estimates even when the learning proceeds over a generic network setting in which the distributions that govern the rate and age of received measurements may change significantly over time. We demonstrate the efficacy of the architecture by comparing it with the baselines of the Time-varying Kalman Filter and the Unscented Kalman Filter. The architecture enables empirical insights with regard to maintaining the ages of updates at the estimator, which are used by both it and the baselines.
We consider the problem of solving random parity games. We prove that parity games exhibit a phase transition threshold $d_P$, so that when the graph that defines the game has degree $d > d_P$, there exists a polynomial time algorithm that solves the game with high probability when the number of nodes goes to infinity. We further propose the SWCP (Self-Winning Cycles Propagation) algorithm and show that, when the degree is large enough, SWCP solves the game with high probability. Furthermore, the complexity of SWCP is polynomial, $O\Big(|{\cal V}|^2 + |{\cal V}||{\cal E}|\Big)$. The design of SWCP is based on the threshold for the appearance of particular types of cycles in the players' respective subgraphs. We further show that non-sparse games can be solved in time $O(|{\cal V}|)$ with high probability, and state a conjecture concerning the hardness of the $d=2$ case.
Automatic segmentation of lung lesions associated with COVID-19 in CT images requires a large amount of annotated volumes. Annotations mandate expert knowledge and are time-intensive to obtain through fully manual segmentation methods. Additionally, lung lesions have large inter-patient variations, with some pathologies having a similar visual appearance to healthy lung tissues. This poses a challenge when applying existing semi-automatic interactive segmentation techniques for data labelling. To address these challenges, we propose an efficient convolutional neural network (CNN) that can be learned online while the annotator provides scribble-based interaction. To accelerate learning from only the samples labelled through user interactions, a patch-based approach is used for training the network. Moreover, we use a weighted cross-entropy loss to address the class imbalance that may result from user interactions. During online inference, the learned network is applied to the whole input volume using a fully convolutional approach. We compare our proposed method with state-of-the-art methods and show that it outperforms them on the task of annotating lung lesions associated with COVID-19, achieving a 16% higher Dice score while reducing execution time by 3$\times$ and requiring 9000 fewer scribble-based labelled voxels. Due to the online learning aspect, our approach adapts quickly to user input, resulting in high quality segmentation labels. Source code will be made available upon acceptance.
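The weighted cross-entropy loss and the Dice score mentioned above can be sketched for the binary (lesion versus background) case as follows. This is an illustrative sketch, not the authors' implementation: the abstract does not state how the class weights are chosen, so the inverse-frequency scheme in `inverse_frequency_weights` is an assumption, and all function names are hypothetical.

```python
import math


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: each class weighted by n / (2 * class count)."""
    n = len(labels)
    n_fg = sum(labels)
    n_bg = n - n_fg
    if n_fg == 0 or n_bg == 0:
        return 1.0, 1.0  # only one class labelled so far: fall back to equal weights
    return n / (2 * n_fg), n / (2 * n_bg)


def weighted_cross_entropy(probs, labels, w_fg, w_bg):
    """Mean weighted binary cross-entropy over the labelled voxels.

    probs holds predicted foreground probabilities; labels holds 1 for
    foreground (lesion) and 0 for background.
    """
    eps = 1e-12  # clamp to avoid log(0)
    total = 0.0
    for p, t in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= w_fg * t * math.log(p) + w_bg * (1 - t) * math.log(1.0 - p)
    return total / len(labels)


def dice_score(pred, target):
    """Dice overlap 2|A n B| / (|A| + |B|) between two binary masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 2.0 * intersection / denom if denom else 1.0
```

With scribbles that label one foreground voxel and three background voxels, the assumed scheme weights the foreground class three times as heavily as the background class, which offsets the imbalance in the labelled samples.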