With increasing physical threats in recent years targeted at critical infrastructures, it is crucial to establish a reliable threat monitoring system integrating video surveillance and digital sensors based on cutting-edge technologies. A physical threat monitoring solution unifying the floorplan, cameras, and sensors for smart buildings has been set up in our study. Computer vision and deep learning models are used for video streams analysis. When a threat is detected by a rule engine based on the real-time analysis results combining with feedback from related digital sensors, an alert is sent to the Video Management System so that human operators can take further action. A physical threat monitoring system typically needs to address complex and even destructive incidents, such as fire, which is unrealistic to simulate in real life. Restrictions imposed during the Covid-19 pandemic and privacy concerns have added to the challenges. Our study utilises the Unreal Engine to simulate some typical suspicious and intrusion scenes with photorealistic qualities in the context of a virtual building. Add-on programs are implemented to transfer the video stream from virtual PTZ cameras to the Milestone Video Management System and enable users to control those cameras from the graphic client application. Virtual sensors such as fire alarms, temperature sensors and door access controls are implemented similarly, fulfilling the same programmatic VMS interface as real-life sensors. Thanks to this simulation system's extensibility and repeatability, we have consolidated this unified physical threat monitoring system and verified its effectiveness and user-friendliness. Both the simulated Unreal scenes and the software add-ons developed during this study are highly modulated and thereby are ready for reuse in future projects in this area.
In this paper, we propose a vocoder based on a pair of forward and reverse-time linear stochastic differential equations (SDE). The solutions of this SDE pair are two stochastic processes, one of which turns the distribution of wave, that we want to generate, into a simple and tractable distribution. The other is the generation procedure that turns this tractable simple signal into the target wave. The model is called It\^oWave. It\^oWave use the Wiener process as a driver to gradually subtract the excess signal from the noise signal to generate realistic corresponding meaningful audio respectively, under the conditional inputs of original mel spectrogram. The results of the experiment show that the mean opinion scores (MOS) of It\^oWave can exceed the current state-of-the-art (SOTA) methods, and reached 4.35$\pm$0.115. The generated audio samples are available online\footnotemark[2].
Drug Side-Effects (DSEs) have a high impact on public health, care system costs, and drug discovery processes. Predicting the probability of side-effects, before their occurrence, is fundamental to reduce this impact, in particular on drug discovery. Candidate molecules could be screened before undergoing clinical trials, reducing the costs in time, money, and health of the participants. Drug side-effects are triggered by complex biological processes involving many different entities, from drug structures to protein-protein interactions. To predict their occurrence, it is necessary to integrate data from heterogeneous sources. In this work, such heterogeneous data is integrated into a graph dataset, expressively representing the relational information between different entities, such as drug molecules and genes. The relational nature of the dataset represents an important novelty for drug side-effect predictors. Graph Neural Networks (GNNs) are exploited to predict DSEs on our dataset with very promising results. GNNs are deep learning models that can process graph-structured data, with minimal information loss, and have been applied on a wide variety of biological tasks. Our experimental results confirm the advantage of using relationships between data entities, suggesting interesting future developments in this scope. The experimentation also shows the importance of specific subsets of data in determining associations between drugs and side-effects.
We study regret minimization for infinite-horizon average-reward Markov Decision Processes (MDPs) under cost constraints. We start by designing a policy optimization algorithm with carefully designed action-value estimator and bonus term, and show that for ergodic MDPs, our algorithm ensures $\widetilde{O}(\sqrt{T})$ regret and constant constraint violation, where $T$ is the total number of time steps. This strictly improves over the algorithm of (Singh et al., 2020), whose regret and constraint violation are both $\widetilde{O}(T^{2/3})$. Next, we consider the most general class of weakly communicating MDPs. Through a finite-horizon approximation, we develop another algorithm with $\widetilde{O}(T^{2/3})$ regret and constraint violation, which can be further improved to $\widetilde{O}(\sqrt{T})$ via a simple modification, albeit making the algorithm computationally inefficient. As far as we know, these are the first set of provable algorithms for weakly communicating MDPs with cost constraints.
Intelligent Internet of Things (IoT) systems based on deep neural networks (DNNs) have been widely deployed in the real world. However, DNNs are found to be vulnerable to adversarial examples, which raises people's concerns about intelligent IoT systems' reliability and security. Testing and evaluating the robustness of IoT systems becomes necessary and essential. Recently various attacks and strategies have been proposed, but the efficiency problem remains unsolved properly. Existing methods are either computationally extensive or time-consuming, which is not applicable in practice. In this paper, we propose a novel framework called Attack-Inspired GAN (AI-GAN) to generate adversarial examples conditionally. Once trained, it can generate adversarial perturbations efficiently given input images and target classes. We apply AI-GAN on different datasets in white-box settings, black-box settings and targeted models protected by state-of-the-art defenses. Through extensive experiments, AI-GAN achieves high attack success rates, outperforming existing methods, and reduces generation time significantly. Moreover, for the first time, AI-GAN successfully scales to complex datasets e.g. CIFAR-100 and ImageNet, with about $90\%$ success rates among all classes.
Temporal abstraction in reinforcement learning (RL), offers the promise of improving generalization and knowledge transfer in complex environments, by propagating information more efficiently over time. Although option learning was initially formulated in a way that allows updating many options simultaneously, using off-policy, intra-option learning (Sutton, Precup & Singh, 1999), many of the recent hierarchical reinforcement learning approaches only update a single option at a time: the option currently executing. We revisit and extend intra-option learning in the context of deep reinforcement learning, in order to enable updating all options consistent with current primitive action choices, without introducing any additional estimates. Our method can therefore be naturally adopted in most hierarchical RL frameworks. When we combine our approach with the option-critic algorithm for option discovery, we obtain significant improvements in performance and data-efficiency across a wide variety of domains.
Sensor embedded glove systems have been reported to require careful, time consuming and precise calibrations on a per user basis in order to obtain consistent usable data. We have developed a low cost, flex sensor based smart glove system that may be resilient to the common limitations of data gloves. This system utilizes an Arduino based micro controller as well as a single flex sensor on each finger. Feedback from the Arduinos analog to digital converter can be used to infer objects dimensional properties, the reactions of each individual finger will differ with respect to the size and shape of a grasped object. In this work, we report its use in statistically differentiating stationary objects of spherical and cylindrical shapes of varying radii regardless of the variations introduced by gloves users. Using our sensor embedded glove system, we explored the practicability of object classification based on the tactile sensor responses from each finger of the smart glove. An estimated standard error of the mean was calculated from each of the of five fingers averaged flex sensor readings. Consistent with the literature, we found that there is a systematic dependence between an objects shape, dimension and the flex sensor readings. The sensor output from at least one finger, indicated a non-overlapping confidence interval when comparing spherical and cylindrical objects of the same radius. When sensing spheres and cylinders of varying sizes, all five fingers had a categorically varying reaction to each shape. We believe that our findings could be used in machine learning models for real-time object identification.
Continuous depth neural networks, such as Neural ODEs, have refashioned the understanding of residual neural networks in terms of non-linear vector-valued optimal control problems. The common solution is to use the adjoint sensitivity method to replicate a forward-backward pass optimisation problem. We propose a new approach which explicates the network's `depth' as a fundamental variable, thus reducing the problem to a system of forward-facing initial value problems. This new method is based on the principle of `Invariant Imbedding' for which we prove a general solution, applicable to all non-linear, vector-valued optimal control problems with both running and terminal loss. Our new architectures provide a tangible tool for inspecting the theoretical--and to a great extent unexplained--properties of network depth. They also constitute a resource of discrete implementations of Neural ODEs comparable to classes of imbedded residual neural networks. Through a series of experiments, we show the competitive performance of the proposed architectures for supervised learning and time series prediction.
The area of domain adaptation has been instrumental in addressing the domain shift problem encountered by many applications. This problem arises due to the difference between the distributions of source data used for training in comparison with target data used during realistic testing scenarios. In this paper, we introduce a novel MultiScale Domain Adaptive YOLO (MS-DAYOLO) framework that employs multiple domain adaptation paths and corresponding domain classifiers at different scales of the recently introduced YOLOv4 object detector. Building on our baseline multiscale DAYOLO framework, we introduce three novel deep learning architectures for a Domain Adaptation Network (DAN) that generates domain-invariant features. In particular, we propose a Progressive Feature Reduction (PFR), a Unified Classifier (UC), and an Integrated architecture. We train and test our proposed DAN architectures in conjunction with YOLOv4 using popular datasets. Our experiments show significant improvements in object detection performance when training YOLOv4 using the proposed MS-DAYOLO architectures and when tested on target data for autonomous driving applications. Moreover, MS-DAYOLO framework achieves an order of magnitude real-time speed improvement relative to Faster R-CNN solutions while providing comparable object detection performance.
A vehicle routing and crew scheduling problem (VRCSP) consists of simultaneously planning the routes of a fleet of vehicles and scheduling the crews, where the vehicle-crew correspondence is not fixed through time. This allows a greater planning flexibility and a more efficient use of the fleet, but in counterpart, a high synchronisation is demanded. In this work, we present a VRCSP where pickup-and-delivery requests with time windows have to be fulfilled over a given planning horizon by using trucks and drivers. Crews can be composed of 1 or 2 drivers and any of them can be relieved in a given set of locations. Moreover, they are allowed to travel among locations with non-company shuttles, at an additional cost that is minimised. As our problem considers distinct routes for trucks and drivers, we have an additional flexibility not contemplated in other previous VRCSP given in the literature where a crew is handled as an indivisible unit. We tackle this problem with a two-stage sequential approach: a set of truck routes is computed in the first stage and a set of driver routes consistent with the truck routes is obtained in the second one. We design and evaluate the performance of a metaheuristic based algorithm for the latter stage. Our algorithm is mainly a GRASP with a perturbation procedure that allows reusing solutions already found in case the search for new solutions becomes difficult. This procedure together with other to repair infeasible solutions allow us to find high-quality solutions on instances of 100 requests spread across 15 cities with a fleet of 12-32 trucks (depending on the planning horizon) in less than an hour. We also conclude that the possibility of carrying an additional driver leads to a decrease of the cost of external shuttles by about 60% on average with respect to individual crews and, in some cases, to remove this cost completely.