We consider the adversarial online multi-task reinforcement learning setting, where in each of $K$ episodes the learner is given an unknown task taken from a finite set $\mathcal{M}$ of $M$ unknown finite-horizon MDP models. The learner's objective is to minimize its regret with respect to the optimal policy for each task. We assume the MDPs in $\mathcal{M}$ are well-separated under a notion of $\lambda$-separability, and show that this notion generalizes many task-separability notions from previous works. We prove a minimax lower bound of $\Omega(K\sqrt{DSAH})$ on the regret of any learning algorithm and an instance-specific lower bound of $\Omega(\frac{K}{\lambda^2})$ on the sample complexity of a class of uniformly-good cluster-then-learn algorithms. We use a novel construction called 2-JAO MDP to prove the instance-specific lower bound. The lower bounds are complemented by a polynomial-time algorithm that obtains a $\tilde{O}(\frac{K}{\lambda^2})$ sample complexity guarantee for the clustering phase and a $\tilde{O}(\sqrt{MK})$ regret guarantee for the learning phase, indicating that the dependency on $K$ and $\frac{1}{\lambda^2}$ is tight.
This paper presents a new method for integrated time-optimal routing and trajectory optimization of multirotor unmanned aerial vehicles (UAVs). Our approach extends the well-known Traveling Salesman Problem by accounting for the limited maneuverability of the UAVs due to their kinematic properties. To this end, we allow each waypoint to be traversed with a discretized velocity as well as a discretized flight direction and compute time-optimal trajectories to determine the travel time costs for each edge. We refer to this novel optimization problem as the Trajectory-based Traveling Salesman Problem (TBTSP). The results show that compared to a state-of-the-art approach for Traveling Salesman Problems with kinematic restrictions of UAVs, we can decrease mission duration by up to 15\%.
Ultra-wideband (UWB) technology has become very popular for indoor positioning and distance estimation (DE) systems due to the decimeter-level accuracy it achieves with time-of-flight-based techniques. Techniques for DE relying on signal strength (DESS) have received less attention. As a consequence, existing benchmarks consist of simple channel characterizations rather than methods aiming to increase accuracy. Further development in DESS may enable the use of lower-cost transceivers in applications that can tolerate lower accuracy than time-of-flight-based techniques provide. Moreover, DESS is a fundamental building block of a recently proposed approach that can secure DE against cyberattacks that cannot be prevented using time-of-flight-based techniques alone. In this paper, we evaluate the suitability of several machine-learning models trained in different real-world environments to increase UWB-based DESS accuracy. Additionally, aiming for implementation in commercial off-the-shelf (COTS) transceivers, we propose and evaluate an approach to resolve ambiguities affecting DESS in these devices. Our results show that the proposed DE approaches achieve sub-decimeter accuracy when the models are tested in the same environment and positions in which they were trained, and an average MAE of 24 cm when tested in a different environment. Three datasets obtained from our experiments are made publicly available.
In this paper, we present a wireless ECG-derived Respiration Rate (RR) estimation method using an autoencoder with a DCT layer. The wireless wearable system records the ECG data of the subject, and the respiration rate is determined from the variations in the baseline level of the ECG data. A straightforward Fourier analysis of the ECG data obtained using the wireless wearable system may lead to incorrect results due to uneven breathing. To improve the estimation precision, we propose a neural network that uses a novel Discrete Cosine Transform (DCT) layer to denoise and decorrelate the data. The DCT layer has trainable weights and soft-thresholds in the transform domain. On our dataset, our novel neural network with the DCT layer improves on the Mean Squared Error (MSE) and Mean Absolute Error (MAE) of the Fourier analysis-based approach.
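To make the idea of a transform-domain layer concrete, the following is a minimal NumPy sketch of one possible forward pass: transform with an orthonormal DCT-II, scale each coefficient by a weight, soft-threshold, and transform back. The function names (`dct_matrix`, `soft_threshold`, `dct_layer_forward`), the order of scaling and thresholding, and the use of per-coefficient parameters are assumptions for illustration; the abstract does not specify the layer's exact form, and no training loop is shown.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ x is the DCT of x, C.T @ X inverts it."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)     # rescale the DC row so that C @ C.T = I
    return c

def soft_threshold(x, t):
    """Shrink each value toward zero by t; values with |x| <= t become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def dct_layer_forward(x, weights, thresholds):
    """Hypothetical forward pass (assumed form, not the authors' layer).

    x          : array of shape (..., n), signal windows
    weights    : array of shape (n,), per-coefficient scaling (trainable in a real model)
    thresholds : array of shape (n,), non-negative soft-thresholds (trainable in a real model)
    """
    c = dct_matrix(x.shape[-1])
    coeffs = x @ c.T                                   # forward DCT along the last axis
    coeffs = soft_threshold(weights * coeffs, thresholds)
    return coeffs @ c                                  # inverse DCT back to the signal domain
```

With unit weights and zero thresholds the layer reduces to the identity, which is a convenient sanity check; larger thresholds suppress small transform coefficients, which is the usual denoising mechanism behind soft-thresholding.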
The decomposition of a time series is an essential task that helps to understand its nature. It facilitates the analysis and forecasting of complex time series that express various hidden components such as the trend, seasonal components, cyclic components and irregular fluctuations. It is therefore crucial in many fields for forecasting and decision processes. In recent years, many time series decomposition methods have been developed, which extract and reveal different time series properties. Unfortunately, they neglect a very important property, namely the time series variance. To deal with heteroscedasticity in time series, the method proposed in this work, a seasonal-trend-dispersion decomposition (STD), extracts the trend, the seasonal component and a component related to the dispersion of the time series. We define STD in two ways: with and without an irregular component. We show how STD can be used for time series analysis and forecasting.
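As an illustration of what a trend, seasonal and dispersion split might look like in the variant without an irregular component, here is a minimal sketch. The formulas are assumptions, since the abstract does not give them: within each seasonal cycle the trend is taken as the cycle mean, the dispersion as the Euclidean norm of the deviations from that mean, and the seasonal component as the normalized deviations, so that $y = s \cdot d + t$ holds exactly. The function name `std_decompose` and the requirement that the series length be a multiple of the period are also hypothetical.

```python
import numpy as np

def std_decompose(y, period):
    """Assumed STD-style split without an irregular component: y = seasonal * dispersion + trend."""
    y = np.asarray(y, dtype=float)
    if y.size % period != 0:
        raise ValueError("series length must be a multiple of the period")
    cycles = y.reshape(-1, period)                  # one row per seasonal cycle
    level = cycles.mean(axis=1, keepdims=True)      # per-cycle mean (trend)
    dev = cycles - level
    disp = np.sqrt((dev ** 2).sum(axis=1, keepdims=True))  # per-cycle dispersion
    disp_safe = np.where(disp == 0, 1.0, disp)      # avoid division by zero for flat cycles
    seasonal = dev / disp_safe                      # zero mean, unit norm per cycle
    trend = np.repeat(level, period, axis=1).ravel()
    dispersion = np.repeat(disp, period, axis=1).ravel()
    return trend, seasonal.ravel(), dispersion
```

Under these assumptions the trend and dispersion are piecewise constant over each cycle, and the seasonal patterns of different cycles become directly comparable because they share the same mean and scale.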
The rapid outbreak of the COVID-19 pandemic prompted scientists and researchers to prepare the world for future disasters. During the pandemic, global healthcare authorities stressed the importance of disinfecting objects and surfaces. To provide efficient and safe disinfection services during the pandemic, robots have been used for indoor assets. In this paper, we envision the use of drones for the disinfection of outdoor assets in hospitals and other facilities. Such heterogeneous assets may have different service demands (e.g., service time, quantity of disinfectant material), whereas drones typically have limited capacity (i.e., travel time, disinfectant carrying capacity). To serve all the facility assets efficiently, the drone-to-asset allocation and the drone travel routes must be optimized. We formulate a capacitated vehicle routing problem (CVRP) to find an optimal route for each drone such that the total service time is minimized while the drones meet the demands of each asset allocated to them. The problem is solved using mixed integer programming (MIP). As CVRP is an NP-hard problem, we propose a lightweight heuristic that achieves sub-optimal performance while reducing the time complexity of solving problems involving a large number of assets.
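For readers unfamiliar with CVRP heuristics, the sketch below shows a generic greedy nearest-neighbor construction. It is not the heuristic proposed in the paper, whose details the abstract does not give. It also simplifies the problem: only a single capacity (disinfectant load) is modeled, travel-time limits and service times are ignored, and the names `greedy_cvrp`, `depot`, `assets` and `demands` are hypothetical.

```python
import math

def greedy_cvrp(depot, assets, demands, capacity):
    """Build routes greedily: each drone repeatedly visits the nearest asset it can still serve.

    depot    : (x, y) start position of every drone
    assets   : list of (x, y) asset positions
    demands  : list of disinfectant demands, one per asset
    capacity : disinfectant capacity of a single drone
    Returns a list of routes, each a list of asset indices in visiting order.
    """
    if any(d > capacity for d in demands):
        raise ValueError("an asset demands more than one drone can carry")
    unserved = set(range(len(assets)))
    routes = []
    while unserved:
        route, load, pos = [], 0.0, depot
        while True:
            # Assets that still fit in the remaining capacity (sorted for determinism).
            feasible = [i for i in sorted(unserved) if load + demands[i] <= capacity]
            if not feasible:
                break
            nxt = min(feasible, key=lambda i: math.dist(pos, assets[i]))
            route.append(nxt)
            load += demands[nxt]
            pos = assets[nxt]
            unserved.remove(nxt)
        routes.append(route)
    return routes
```

A construction like this runs in polynomial time and yields a feasible but generally sub-optimal solution, which is the usual trade-off against an exact MIP formulation on large instances.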
A track-before-detect (TBD) particle filter-based method for the detection and tracking of low-observable objects from a sequence of image frames in the presence of noise and clutter is studied. At each time instant, after an image frame is received, preprocessing is first applied to the image. The frame is then passed to the detection and tracking algorithm, which is based on a particle filter. The performance of the approach is evaluated for the detection and tracking of an object in different scenarios that include noise and clutter.
Traffic flow prediction is an important part of smart transportation. The goal is to predict future traffic conditions based on historical data recorded by sensors and the traffic network. As cities continue to develop, parts of the transportation network will be added or modified. Accurately predicting traffic on expanding and evolving long-term streaming networks is therefore of great significance. To this end, we propose a new simulation-based criterion that considers teaching autonomous agents to mimic sensor patterns, planning their next visit based on the sensor's profile (e.g., traffic, speed, occupancy). The data recorded by the sensor is most accurate when the agent can perfectly simulate the sensor's activity pattern. We propose to formulate the problem as a continuous reinforcement learning task, where the agent is the next flow value predictor, the action is the next time-series flow value in the sensor, and the environment state is a dynamically fused representation of the sensor and transportation network. Actions taken by the agent change the environment, which in turn forces the agent's mode to update, while the agent further explores changes in the dynamic traffic network, which helps the agent predict its next visit more accurately. We therefore develop a strategy in which sensors and traffic networks update each other and incorporate temporal context to quantify state representations that evolve over time.
In data-driven systems, data exploration is imperative for making real-time decisions. However, big data is stored in massive databases from which retrieval is difficult. Approximate Query Processing (AQP) is a technique for providing approximate answers to aggregate queries based on a summary of the data (synopsis) that closely replicates the behavior of the actual data. It is useful where an approximate answer to a query is acceptable in exchange for a fraction of the real execution time. In this paper, we discuss the use of Generative Adversarial Networks (GANs) for generating tabular data that can be employed in AQP for synopsis construction. We first discuss the challenges associated with constructing synopses in relational databases and then introduce solutions to those challenges. Following that, we organize statistical metrics to evaluate the quality of the generated synopses. We conclude that tabular data complexity makes it difficult for algorithms to understand relational database semantics during training, and that improved versions of tabular GANs are capable of constructing synopses to revolutionize data-driven decision-making systems.
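The basic AQP mechanism, answering an aggregate query from a small synopsis and scaling the result up, can be sketched as follows. Here a uniform random sample stands in for the synopsis; this is an assumption made to keep the example self-contained, whereas the paper discusses GAN-generated tabular data for that role. The helper names `build_synopsis` and `approx_aggregates` and the dictionary row format are hypothetical.

```python
import random

def build_synopsis(rows, size, seed=0):
    """Stand-in synopsis: a uniform random sample of the table (not a GAN-generated one)."""
    rng = random.Random(seed)
    return rng.sample(rows, size)

def approx_aggregates(synopsis, total_rows, predicate, column):
    """Estimate COUNT, SUM and AVG of `column` over the rows matching `predicate`.

    COUNT and SUM are scaled by total_rows / len(synopsis); AVG needs no scaling.
    """
    matched = [row[column] for row in synopsis if predicate(row)]
    scale = total_rows / len(synopsis)
    count = len(matched) * scale
    total = sum(matched) * scale
    avg = sum(matched) / len(matched) if matched else None
    return count, total, avg
```

A query such as `SELECT COUNT(*), SUM(amount), AVG(amount) FROM t WHERE region = 'a'` would then be approximated by `approx_aggregates(synopsis, len(rows), lambda r: r["region"] == "a", "amount")`, touching only the synopsis rather than the full table.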
Tissue typology annotation in Whole Slide histological images is a complex and tedious, yet necessary, task for the development of computational pathology models. We propose to address this problem by applying Open Set Recognition techniques to the task of jointly classifying tissue that belongs to a set of annotated classes, e.g. clinically relevant tissue categories, while rejecting at test time Open Set samples, i.e. images that belong to categories not present in the training set. To this end, we introduce a new approach for Open Set histopathological image recognition based on training a model to accurately identify image categories and simultaneously predict which data augmentation transform has been applied. At test time, we measure model confidence in predicting this transform, which we expect to be lower for images in the Open Set. We carry out comprehensive experiments in the context of colorectal cancer assessment from histological images, which provide evidence of the strengths of our approach in automatically identifying samples from unknown categories. Code is released at https://github.com/agaldran/t3po.