Given the limited capability of machine intelligence, autonomous vehicles still cannot handle every situation or fully replace human drivers. Because humans exhibit strong robustness and adaptability in complex driving scenarios, it is important to introduce humans into the training loop of artificial intelligence, leveraging human intelligence to further advance machine learning algorithms. In this study, a real-time human-guidance-based deep reinforcement learning (Hug-DRL) method is developed for policy training in autonomous driving. Through a newly designed control transfer mechanism between human and automation, a human can intervene and correct the agent's unreasonable actions in real time when necessary during model training. Based on this human-in-the-loop guidance mechanism, an improved actor-critic architecture with modified policy and value networks is developed. The fast convergence of the proposed Hug-DRL allows real-time human guidance actions to be fused into the agent's training loop, further improving the efficiency and performance of deep reinforcement learning. The developed method is validated by human-in-the-loop experiments with 40 subjects and compared with other state-of-the-art learning approaches. The results suggest that the proposed method can effectively enhance the training efficiency and performance of the deep reinforcement learning algorithm under human guidance, without imposing specific requirements on participant expertise or experience.
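The control transfer idea in the abstract above can be illustrated with a minimal sketch. It assumes only what the abstract states: a human may override the agent's action during training. The function names, the `None` convention for "no intervention", and the buffer layout are hypothetical and are not taken from Hug-DRL itself.

```python
def control_transfer(agent_action, human_action=None):
    """Return (executed_action, intervened).

    Assumption: human_action is None whenever the human is not intervening.
    """
    if human_action is not None:
        return human_action, True
    return agent_action, False


def record_step(buffer, state, agent_action, human_action, reward, next_state):
    """Execute one step and store it with an intervention flag.

    The flag lets a learner treat human-guided samples differently;
    how that is done is outside the scope of this sketch.
    """
    action, intervened = control_transfer(agent_action, human_action)
    buffer.append({
        "state": state,
        "action": action,
        "reward": reward,
        "next_state": next_state,
        "intervened": intervened,
    })
    return action
```

Storing the flag alongside each transition is one plausible way to let the policy and value updates distinguish guided from unguided experience.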
This paper introduces VolMap, a real-time approach for the semantic segmentation of a 3D LiDAR surround-view system in autonomous vehicles. We design an optimized deep convolutional neural network that can accurately segment the point cloud produced by a 360\degree{} LiDAR setup, where the input consists of a volumetric bird's-eye view with LiDAR height layers used as input channels. We further investigate the use of a multi-LiDAR setup and its effect on the performance of the semantic segmentation task. Our evaluations are carried out on a large-scale 3D object detection benchmark containing a LiDAR cocoon setup, along with the KITTI dataset, where the per-point segmentation labels are derived from 3D bounding boxes. We show that VolMap achieves an excellent balance between high accuracy and real-time execution on a CPU.
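The volumetric bird's-eye-view input described in the VolMap abstract above can be sketched as follows. This is an illustration under stated assumptions, not VolMap's actual preprocessing: the grid ranges, the cell size, the number of height layers, and the binary occupancy encoding are all hypothetical choices.

```python
def volumetric_bev(points, x_range, y_range, z_range, cell, n_layers):
    """Bin (x, y, z) points into a grid indexed as grid[layer][row][col].

    Each height layer plays the role of one input channel.
    Assumption: a cell is marked 1.0 if any point falls inside it.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    z_min, z_max = z_range
    n_cols = int(round((x_max - x_min) / cell))
    n_rows = int(round((y_max - y_min) / cell))
    layer_height = (z_max - z_min) / n_layers
    grid = [[[0.0] * n_cols for _ in range(n_rows)] for _ in range(n_layers)]
    for x, y, z in points:
        inside = (x_min <= x < x_max and y_min <= y < y_max
                  and z_min <= z < z_max)
        if not inside:
            continue  # drop points outside the region of interest
        # min() guards against floating-point rounding at the upper edge
        col = min(int((x - x_min) / cell), n_cols - 1)
        row = min(int((y - y_min) / cell), n_rows - 1)
        layer = min(int((z - z_min) / layer_height), n_layers - 1)
        grid[layer][row][col] = 1.0
    return grid
```

The resulting layers-by-rows-by-columns array has the same layout as a multi-channel image, which is what allows a 2D convolutional network to consume it.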
Behaviour selection has been an active research topic for robotics, in particular in the field of human-robot interaction. For a robot to interact effectively and autonomously with humans, the coupling between techniques for human activity recognition, based on sensing information, and robot behaviour selection, based on decision-making mechanisms, is of paramount importance. However, most approaches to date consist of deterministic associations between the recognised activities and the robot behaviours, neglecting the uncertainty inherent to sequential predictions in real-time applications. In this paper, we address this gap by presenting a neurorobotics approach based on computational models that resemble neurophysiological aspects of living beings. This neurorobotics approach was compared to a non-bioinspired, heuristics-based approach. To evaluate both approaches, a robot simulation is developed, in which a mobile robot has to accomplish tasks according to the activity being performed by the inhabitant of an intelligent home. The outcomes of each approach were evaluated according to the number of correct outcomes provided by the robot. Results revealed that the neurorobotics approach is advantageous, especially considering the computational models based on more complex animals.
Attitude estimation, or attitude determination, is a fundamental task for satellites to remain operational. The task is further complicated on small satellites by the limited space and computational power available on board. This, coupled with a usually low budget, prevents small satellites from using high-precision sensors for the critical task of attitude estimation. In addition, because of their size and weight, small satellites are more sensitive to environmental and orbital disturbances than their larger counterparts. Magnetic disturbance is the major contributor to orbital disturbances on small satellites in Low Earth Orbit (LEO). This magnetic disturbance depends on the Residual Magnetic Moment (RMM) of the satellite itself, which for higher accuracy should be determined in real time. This paper presents a method for in-orbit estimation of the satellite magnetic dipole using a Random Walk Model in order to circumvent the inaccuracy arising from unknown orbital magnetic disturbances. Both the dipole and the attitude of the satellite are estimated using only a magnetometer as the sensor.
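Two ingredients named in the abstract above can be sketched briefly. The first is the standard relation that a magnetic dipole m in a field B experiences the torque m x B. The second is the prediction step of a generic random walk state model, in which the estimate is carried forward unchanged while its uncertainty grows. The function names, the per-axis variance representation, and the scalar process noise are assumptions for illustration; the paper's actual filter is not specified in the abstract.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def disturbance_torque(dipole, b_field):
    """Magnetic disturbance torque (N*m) for a dipole (A*m^2) in a field (T)."""
    return cross(dipole, b_field)


def random_walk_predict(dipole_est, variance, process_noise):
    """Prediction step for a random walk state.

    Model assumed here: m[k+1] = m[k] + w[k], with w[k] zero-mean noise of
    variance process_noise per axis. The mean is unchanged; variance grows.
    """
    return list(dipole_est), [v + process_noise for v in variance]
```

Treating the dipole as a slowly drifting random walk lets an estimator track it without a physical model of how it changes.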
Long-term autonomy in service robotics is a current research topic, especially for dynamic, large-scale environments that change over time. We present Sobi, a mobile service robot developed as an interactive guide for open environments, such as public places with indoor and outdoor areas. The robot will serve as a platform for environmental modeling and human-robot interaction. Its main hardware and software components, which we freely license as a documented open source project, are presented. Another key focus is Sobi's monitoring system for long-term autonomy, which restores system components in a targeted manner in order to extend the total system lifetime without unplanned intervention. We demonstrate first results of the long-term autonomous capabilities in a 16-day indoor deployment, in which the robot patrols a total of 66.6 km with an average of 5.5 hours of travel time per weekday, charging autonomously in between. In a user study with 12 participants, we evaluate the appearance and usability of the user interface, which allows users to interactively query information about the environment and directions.
Time series data analytics has been a problem of substantial interest for decades, and Dynamic Time Warping (DTW) has been the most widely adopted technique for measuring dissimilarity between time series. A number of global-alignment kernels have since been proposed in the spirit of DTW to extend its use to kernel-based estimation methods such as support vector machines. However, those kernels suffer from diagonal dominance of the Gram matrix and quadratic complexity w.r.t. the sample size. In this work, we study a family of alignment-aware positive definite (p.d.) kernels, with its feature embedding given by a distribution of \emph{Random Warping Series (RWS)}. The proposed kernel does not suffer from diagonal dominance and naturally admits a \emph{Random Features} (RF) approximation, which reduces the computational complexity of existing DTW-based techniques from quadratic to linear in both the number and the length of the time series. We also study the convergence of the RF approximation for the domain of time series of unbounded length. Our extensive experiments on 16 benchmark datasets demonstrate that RWS outperforms or matches state-of-the-art classification and clustering methods in both accuracy and computational time. Our code and data are available at \url{https://github.com/IBM/RandomWarpingSeries}.
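For reference, the classic DTW dissimilarity mentioned in the abstract above can be computed with the textbook dynamic program below. This is the standard DTW recurrence with an absolute-difference local cost, not the RWS kernel itself; the choice of local cost is an assumption.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences.

    cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    Time and memory are O(len(a) * len(b)).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # extend the best of: step in a, step in b, or step in both
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is `0.0`, because the repeated `2` is absorbed by the warping. The nested loop over both sequences is the source of the quadratic cost in series length.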
We study the problem of stochastic combinatorial pure exploration (CPE), where an agent sequentially pulls a set of single arms (a.k.a. a super arm) and tries to find the best super arm. Among a variety of problem settings of the CPE, we focus on the full-bandit setting, where we cannot observe the reward of each single arm, but only the sum of the rewards. Although we can regard the CPE with full-bandit feedback as a special case of pure exploration in linear bandits, an approach based on linear bandits is not computationally feasible since the number of super arms may be exponential. In this paper, we first propose a polynomial-time bandit algorithm for the CPE under general combinatorial constraints and provide an upper bound of the sample complexity. Second, we design an approximation algorithm for the 0-1 quadratic maximization problem, which arises in many bandit algorithms with confidence ellipsoids. Based on our approximation algorithm, we propose novel bandit algorithms for the top-k selection problem, and prove that our algorithms run in polynomial time. Finally, we conduct experiments on synthetic and real-world datasets, and confirm the validity of our theoretical analysis in terms of both the computation time and the sample complexity.
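The full-bandit feedback model in the abstract above can be made concrete with a small sketch: each pull of a super arm reveals only the sum of its single-arm rewards, and the single-arm means are recovered by least squares over the pulled indicator vectors. This is a generic illustration, not the paper's algorithm. The helper names are hypothetical, the example is noiseless, and the final brute-force search over super arms is exactly the exponential step the paper is designed to avoid.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def estimate_arm_means(pulls, n_arms):
    """Least-squares estimate of single-arm means from summed rewards.

    pulls: list of (super_arm, observed_sum), where super_arm is a tuple of
    distinct arm indices. Assumes the pulled super arms make the normal
    equations non-singular.
    """
    gram = [[0.0] * n_arms for _ in range(n_arms)]
    rhs = [0.0] * n_arms
    for super_arm, total in pulls:
        for i in super_arm:
            rhs[i] += total
            for j in super_arm:
                gram[i][j] += 1.0
    return solve_linear(gram, rhs)


def best_super_arm(estimates, size):
    """Brute-force top-`size` selection; feasible only for tiny instances."""
    arms = range(len(estimates))
    return max(combinations(arms, size),
               key=lambda s: sum(estimates[i] for i in s))
```

With true means 0.1, 0.5, and 0.9 and the three pulls `((0, 1), 0.6)`, `((0, 2), 1.0)`, `((1, 2), 1.4)`, the estimate recovers the means up to rounding and the best super arm of size 2 is `(1, 2)`.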
Medical time-series datasets have unique characteristics that make prediction tasks challenging. Most notably, patient trajectories often contain longitudinal variations in their input-output relationships, generally referred to as temporal conditional shift. Designing sequence models capable of adapting to such time-varying distributions remains a prevailing problem. To address this, we present Model-Attentive Ensemble learning for Sequence modeling (MAES). MAES is a mixture of time-series experts that leverages an attention-based gating mechanism to specialize the experts on different sequence dynamics and adaptively weight their predictions. We demonstrate that MAES significantly outperforms popular sequence models on datasets subject to temporal shift.
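The idea of attention-based gating over experts, as described in the abstract above, can be sketched generically. The code scores each expert with a scaled dot product between a query vector and that expert's key vector, normalizes the scores with a softmax, and returns the weighted prediction. The query/key formulation, the scaling, and the scalar expert outputs are assumptions for illustration and are not claimed to match the MAES architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_prediction(query, expert_keys, expert_preds):
    """Combine expert predictions with attention weights.

    query: context vector for the current time step (hypothetical).
    expert_keys: one key vector per expert, same length as query.
    expert_preds: one scalar prediction per expert.
    Returns (combined_prediction, weights).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale
              for key in expert_keys]
    weights = softmax(scores)
    combined = sum(w * p for w, p in zip(weights, expert_preds))
    return combined, weights
```

Because the weights depend on the query, the mixture can shift toward different experts as the sequence dynamics change over time.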
Domain adaptation is critical when annotations are lacking in a new domain. Because labeling 3D point clouds is highly time-consuming, domain adaptation for 3D semantic segmentation is much anticipated. With the rise of multi-modal datasets, large amounts of 2D images are accessible alongside 3D point clouds. In light of this, we propose to further leverage 2D data for 3D domain adaptation through intra-domain and inter-domain cross-modal learning. For intra-domain cross-modal learning, most existing works sample the dense 2D pixel-wise features down to the same size as the sparse 3D point-wise features, thereby discarding numerous useful 2D features. To address this problem, we propose Dynamic sparse-to-dense Cross Modal Learning (DsCML) to increase the sufficiency of multi-modality information interaction for domain adaptation. For inter-domain cross-modal learning, we further propose Cross Modal Adversarial Learning (CMAL) on 2D and 3D data that contain different semantic content, aiming to promote high-level modal complementarity. We evaluate our model under various multi-modality domain adaptation settings, including day-to-night, country-to-country, and dataset-to-dataset, where it brings large improvements over both uni-modal and multi-modal domain adaptation methods in all settings.
Malware detection is a critical aspect of information security. One difficulty that arises is that malware often evolves over time. To maintain effective malware detection, it is necessary to determine when malware evolution has occurred so that appropriate countermeasures can be taken. We perform a variety of experiments aimed at detecting points in time where a malware family has likely evolved, and we consider secondary tests designed to confirm that evolution has actually occurred. Several malware families are analyzed, each of which includes a number of samples collected over an extended period of time. Our experiments indicate that improved results are obtained using feature engineering based on word embedding techniques. All of our experiments are based on machine learning models, and hence our evolution detection strategies require minimal human intervention and can easily be automated.