Data augmentation techniques from computer vision have been widely adopted as regularization methods to improve data efficiency and generalization performance in vision-based reinforcement learning. We vary the timing at which augmentation is applied, which turns out to be critical depending on the tasks to be solved in training and testing. According to our experiments on the OpenAI Procgen Benchmark, if the regularization imposed by an augmentation is helpful only at test time, it is better to postpone the augmentation until after training than to apply it during training, in terms of both sample and computation complexity. We note that some such augmentations can disturb the training process. Conversely, an augmentation providing regularization that is useful in training needs to be applied throughout the whole training period to fully realize its benefit in terms of not only generalization but also data efficiency. These phenomena suggest a useful timing control of data augmentation in reinforcement learning. A minimal sketch of such timing control is given below.
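The sketch below is not the paper's code; it only illustrates the timing switch the abstract describes, using a pad-and-crop random shift (a common augmentation in vision-based RL) and a hypothetical `preprocess` hook that an agent's observation pipeline might call.

```python
# Minimal sketch (not the paper's implementation): controlling *when* an
# image augmentation is applied in a vision-based RL pipeline.
import numpy as np

def random_shift(obs, pad=4):
    """Pad an HxWxC observation and crop back to HxW at a random offset."""
    h, w, _ = obs.shape
    padded = np.pad(obs, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    top, left = np.random.randint(0, 2 * pad + 1, size=2)
    return padded[top:top + h, left:left + w]

def preprocess(obs, phase, augment_in_training):
    # Timing control: apply the augmentation during training, or postpone it
    # until test time only, depending on where its regularization helps.
    if phase == "train":
        return random_shift(obs) if augment_in_training else obs
    return random_shift(obs)  # test-time ("postponed") augmentation
```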
Current Machine Translation (MT) systems achieve very good results on a growing variety of language pairs and datasets. However, they are known to produce fluent translation outputs that can contain important meaning errors, thus undermining their reliability in practice. Quality Estimation (QE) is the task of automatically assessing the performance of MT systems at test time. Thus, to be useful, QE systems should be able to detect such errors. However, this ability is yet to be tested in current evaluation practices, where QE systems are assessed only in terms of their correlation with human judgements. In this work, we bridge this gap by proposing a general methodology for adversarial testing of QE for MT. First, we show that despite the high correlation with human judgements achieved by recent SOTA models, certain types of meaning errors are still problematic for QE to detect. Second, we show that on average, the ability of a given model to discriminate between meaning-preserving and meaning-altering perturbations is predictive of its overall performance, thus potentially allowing for comparing QE systems without relying on manual quality annotation.
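The following is an illustrative sketch of the discrimination probe described above, under our own simplified assumptions: `qe_score` is a hypothetical stand-in for any sentence-level QE system (higher score = better predicted quality), and the two perturbation functions are toy examples of the meaning-altering vs. meaning-preserving edits the methodology contrasts.

```python
# Illustrative sketch only: probing whether a QE model penalizes a
# meaning-altering edit more than a meaning-preserving one.
def qe_score(source: str, translation: str) -> float:
    raise NotImplementedError("plug in a real sentence-level QE model here")

def perturb_meaning_altering(translation: str) -> str:
    # e.g., inject a negation -- should change the meaning
    return translation.replace(" is ", " is not ", 1)

def perturb_meaning_preserving(translation: str) -> str:
    # e.g., a harmless casing change -- should preserve the meaning
    return translation.capitalize()

def discriminates(source: str, translation: str, margin: float = 0.0) -> bool:
    """Passes if the QE score drops more for the meaning-altering edit."""
    base = qe_score(source, translation)
    drop_altering = base - qe_score(source, perturb_meaning_altering(translation))
    drop_preserving = base - qe_score(source, perturb_meaning_preserving(translation))
    return drop_altering > drop_preserving + margin
```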
The solution of a partial differential equation can be obtained by computing the inverse operator map between the input and the solution space. Towards this end, we introduce a \textit{multiwavelet-based neural operator learning scheme} that compresses the associated operator's kernel using fine-grained wavelets. By explicitly embedding the inverse multiwavelet filters, we learn the projection of the kernel onto fixed multiwavelet polynomial bases. The projected kernel is trained at multiple scales obtained by repeated application of the multiwavelet transform. This allows learning the complex dependencies at various scales and results in a resolution-independent scheme. Compared to prior works, we exploit the fundamental properties of the operator's kernel, which enable a numerically efficient representation. We perform experiments on the Korteweg-de Vries (KdV) equation, Burgers' equation, Darcy Flow, and the Navier-Stokes equation. Compared with the existing neural operator approaches, our model shows significantly higher accuracy and achieves state-of-the-art performance on a range of datasets. For the time-varying equations, the proposed method exhibits a $2\times$--$10\times$ improvement ($0.0018$ ($0.0033$) relative $L^2$ error for Burgers' (KdV) equation). By learning the mappings between function spaces, the proposed method can find the solution of a high-resolution input after learning from lower-resolution data.
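To illustrate only the multi-scale decomposition underlying the scheme (not the multiwavelet polynomial bases or the learned kernels themselves), here is a sketch using standard scalar wavelets via PyWavelets; the repeated transform produces the coarse-plus-details representation, scale by scale, over which a neural operator could learn.

```python
# Sketch of the multi-resolution idea only, using ordinary scalar wavelets
# (the paper uses multiwavelet polynomial bases, not reproduced here).
import numpy as np
import pywt

x = np.linspace(0, 1, 1024)
u = np.sin(2 * np.pi * x) + 0.1 * np.sin(40 * np.pi * x)  # toy input function

# Repeated wavelet transform: one coarse approximation plus detail
# coefficients at several scales.
coarse, *details = pywt.wavedec(u, wavelet="db4", level=4)
print("coarse:", coarse.shape, "details:", [d.shape for d in details])

# Perfect reconstruction confirms no information is lost across scales,
# which is what makes a resolution-independent scheme possible.
u_rec = pywt.waverec([coarse, *details], wavelet="db4")
assert np.allclose(u, u_rec[:len(u)], atol=1e-8)
```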
Structures in the solar corona are the main drivers of space weather processes that might directly or indirectly affect the Earth. Thanks to the most recent space-based solar observatories, with capabilities to acquire high-resolution images continuously, the structures in the solar corona can be monitored over the years with a time resolution of minutes. For this purpose, we have developed a method for automatic segmentation of solar corona structures observed in the EUV spectrum that is based on a deep learning approach utilizing Convolutional Neural Networks. The available input datasets have been examined together with our own dataset based on manual annotation of the target structures. Indeed, the input dataset is the main limitation of the developed model's performance. Our \textit{SCSS-Net} model provides results for coronal holes and active regions that are comparable with other commonly used methods for automatic segmentation. Moreover, it provides a universal procedure for identifying structures in the solar corona with the help of the transfer learning technique. The outputs of the model can then be used for further statistical studies of the connections between solar activity and the influence of space weather on Earth.
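The SCSS-Net architecture is not reproduced here; the following is a minimal PyTorch sketch of the general pattern the abstract names, a convolutional encoder-decoder producing a per-pixel binary mask (e.g., coronal hole vs. background) from a single-channel EUV image.

```python
# Not the actual SCSS-Net: a minimal convolutional encoder-decoder sketch
# for binary segmentation of single-channel EUV images.
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # downsample by 2
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),                   # per-pixel logit
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinySegNet()
logits = model(torch.randn(1, 1, 256, 256))        # stand-in EUV image
mask = torch.sigmoid(logits) > 0.5                 # predicted segmentation mask
```

For transfer learning as mentioned in the abstract, a pretrained encoder would typically replace the one above, with only the decoder (or a final subset of layers) fine-tuned on the manually annotated structures.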
We investigate the activity detection and channel estimation problems for cell-free Internet of Things (IoT) networks with massive random access. In each time slot, only a subset of devices is active, communicating with neighboring access points (APs) using non-orthogonal random pilot sequences. Unlike the centralized processing in cellular networks, activity detection and channel estimation in cell-free IoT are more challenging due to the distributed and user-centric architecture. We propose a two-stage approach to detect the random activities of devices and estimate their channel states. In the first stage, the activity of each device is jointly detected by its adjacent APs based on the vector approximate message passing (Vector AMP) algorithm. In the second stage, each AP re-estimates the channel using the linear minimum mean square error (LMMSE) method based on the detected activities to improve the channel estimation accuracy. We derive closed-form expressions for the activity detection error probability and the mean-squared channel estimation error for a typical device. Finally, we analyze the performance of the entire cell-free IoT network in terms of coverage probability. Simulation results validate the derived closed-form expressions and show that cell-free IoT significantly outperforms collocated massive MIMO and small-cell schemes in terms of coverage probability.
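Below is a simplified single-AP sketch of the two-stage idea. Stage 1 uses a matched-filter energy test as a stand-in for the paper's Vector AMP detector (and assumes the number of active devices is known, purely for brevity); stage 2 is a standard LMMSE re-estimation restricted to the detected support, with an identity channel prior.

```python
# Simplified two-stage sketch: detect active devices, then LMMSE re-estimate.
import numpy as np

rng = np.random.default_rng(0)
N, L, K = 100, 40, 8            # devices, pilot length, active devices
P = (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))) / np.sqrt(2 * L)
active = rng.choice(N, K, replace=False)
h = np.zeros(N, complex)
h[active] = (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2)
sigma2 = 1e-2
y = P @ h + np.sqrt(sigma2 / 2) * (rng.standard_normal(L) + 1j * rng.standard_normal(L))

# Stage 1: activity detection (matched filter in place of Vector AMP).
stats = np.abs(P.conj().T @ y)
detected = np.argsort(stats)[-K:]           # K assumed known for simplicity

# Stage 2: LMMSE re-estimation on the detected support, with prior C = I:
#   h_hat = C P_S^H (P_S C P_S^H + sigma^2 I)^{-1} y
Ps = P[:, detected]
h_hat = Ps.conj().T @ np.linalg.solve(Ps @ Ps.conj().T + sigma2 * np.eye(L), y)
mse = np.mean(np.abs(h_hat - h[detected]) ** 2)
print("detected:", sorted(detected), "MSE:", mse)
```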
Multi-label (ML) data associates multiple classes with each individual sample. This leads to the repeated co-occurrence of several classes, which indicates correlation among them. In this article, the correlation among classes is exploited to improve the classification performance of existing ML classifiers. A novel approach of frequent label-set mining is proposed to extract these correlated classes from the label-sets of the data. Both the co-presence (CP) and co-absence (CA) of classes are taken into consideration. The rules mined from the ML data are then used to incorporate class-correlation information into existing ML classifiers. The soft scores generated by an ML classifier are modified through a novel approach using the CP-CA rules. A concept of certain and uncertain scores is defined, where the proposed method aims to improve the uncertain scores with the help of the certain scores and their corresponding CP-CA rules. This has been experimentally analysed on ten ML datasets for three existing ML classifiers, showing a substantial improvement in their overall performance.
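A toy sketch of this pipeline follows, with our own simplified notation (thresholds, pairwise rules, and the boost size are illustrative choices, not the paper's): mine frequent CP and CA label pairs by support counting, then let "certain" scores (far from the decision boundary) nudge "uncertain" ones via the mined rules.

```python
# Sketch: mine CP/CA label-pair rules, then adjust uncertain soft scores.
import itertools
import numpy as np

Y = np.array([[1, 1, 0],        # toy label matrix: rows = samples,
              [1, 1, 0],        # columns = classes
              [0, 0, 1],
              [1, 1, 1]])

min_support = 0.5
cp_rules = [(i, j) for i, j in itertools.permutations(range(Y.shape[1]), 2)
            if (Y[:, i] & Y[:, j]).mean() >= min_support]              # co-presence
ca_rules = [(i, j) for i, j in itertools.permutations(range(Y.shape[1]), 2)
            if ((1 - Y[:, i]) & (1 - Y[:, j])).mean() >= min_support]  # co-absence

def adjust(scores, low=0.4, high=0.6, boost=0.2):
    """Scores in (low, high) are 'uncertain'; certain scores drive the rules."""
    out = scores.copy()
    for i, j in cp_rules:
        if scores[i] >= high and low < scores[j] < high:
            out[j] = min(1.0, scores[j] + boost)   # CP rule raises j
    for i, j in ca_rules:
        if scores[i] <= low and low < scores[j] < high:
            out[j] = max(0.0, scores[j] - boost)   # CA rule lowers j
    return out

print(adjust(np.array([0.9, 0.55, 0.1])))  # class 1 boosted by the CP rule from class 0
```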
Correspondence-based point cloud registration is a cornerstone in geometric computer vision, robotics perception, photogrammetry, and remote sensing; it seeks to estimate the best rigid transformation between two point clouds from correspondences established over 3D keypoints. However, due to limited robustness and accuracy, current 3D keypoint matching techniques are very prone to yield outliers, sometimes even in very large numbers, making robust estimation essential for point cloud registration. Unfortunately, existing robust methods may suffer from high computational cost or insufficient robustness when encountering high (or even extreme) outlier ratios, which is hardly ideal for practical use. In this paper, we present a novel time-efficient RANSAC-type consensus maximization solver, named DANIEL (Double-layered sAmpliNg with consensus maximization based on stratIfied Element-wise compatibiLity), for robust registration. DANIEL is designed with two layers of random sampling in order to find inlier subsets with the lowest computational cost possible. Specifically, we: (i) apply the rigidity constraint to prune raw outliers in the first layer of one-point sampling, (ii) introduce a series of stratified element-wise compatibility tests to conduct rapid compatibility checking between minimal models so as to realize more efficient consensus maximization in the second layer of two-point sampling, and (iii) employ probabilistic termination conditions to ensure the timely return of the final inlier set. Based on a variety of experiments over multiple real datasets, we show that DANIEL is robust against over 99% outliers and is also significantly faster than existing state-of-the-art robust solvers (e.g. RANSAC, FGR, GORE).
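The sketch below is not DANIEL itself; it is a compact RANSAC-style loop showing the two ingredients the abstract highlights, a rigidity (pairwise length-preservation) compatibility test that prunes contaminated samples before fitting, and consensus maximization with a closed-form Kabsch/SVD rigid fit. It assumes enough iterations are run that at least one all-inlier sample is drawn.

```python
# Compact RANSAC-style sketch: rigidity pruning + consensus maximization.
import numpy as np

def kabsch(src, dst):
    """Best rigid transform (R, t) mapping src points onto dst points."""
    cs, cd = src.mean(0), dst.mean(0)
    U, _, Vt = np.linalg.svd((src - cs).T @ (dst - cd))
    D = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    return R, cd - R @ cs

def ransac_rigid(src, dst, iters=1000, tol=0.05, seed=0):
    rng = np.random.default_rng(seed)
    best, best_count = (np.eye(3), np.zeros(3)), 0
    for _ in range(iters):
        idx = rng.choice(len(src), 3, replace=False)
        a, b = src[idx], dst[idx]
        # Rigidity check: a rigid map preserves pairwise distances, so
        # incompatible (outlier-contaminated) samples are pruned cheaply.
        da = np.linalg.norm(a[:, None] - a[None], axis=-1)
        db = np.linalg.norm(b[:, None] - b[None], axis=-1)
        if np.abs(da - db).max() > tol:
            continue
        R, t = kabsch(a, b)
        inliers = np.linalg.norm(src @ R.T + t - dst, axis=1) < tol
        if inliers.sum() > best_count:
            best_count = inliers.sum()
            best = kabsch(src[inliers], dst[inliers])  # refit on consensus set
    return best
```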
Clinicians are often very sceptical about applying automatic image processing approaches, especially deep learning based methods, in practice. One main reason for this is the black-box nature of these approaches and the inherent lack of insight into the automatically derived decisions. In order to increase trust in these methods, this paper presents approaches that help to interpret and explain the results of deep learning algorithms by depicting the anatomical areas that most influence the algorithm's decision. Moreover, this research presents a unified framework, TorchEsegeta, for applying various interpretability and explainability techniques to deep learning models and generating visual interpretations and explanations that clinicians can use to corroborate their clinical findings; this, in turn, aids in building confidence in such methods. The framework builds on existing interpretability and explainability techniques that currently focus on classification models, extending them to segmentation tasks. In addition, these methods have been adapted to 3D models for volumetric analysis. The proposed framework provides methods to quantitatively compare visual explanations using infidelity and sensitivity metrics. Data scientists can use this framework to perform post-hoc interpretations and explanations of their models, develop more explainable tools, and present the findings to clinicians to increase their faith in such models. The proposed framework was evaluated on a use case scenario of vessel segmentation models trained on Time-of-Flight (TOF) Magnetic Resonance Angiogram (MRA) images of the human brain. Quantitative and qualitative results of a comparative study of different models and interpretability methods are presented. Furthermore, this paper provides an extensive overview of several existing interpretability and explainability methods.
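The TorchEsegeta framework itself is not reproduced here; the sketch below only shows the quantitative-comparison step the abstract mentions, using Captum's infidelity and sensitivity metrics on a stand-in model and a saliency attribution.

```python
# Sketch: scoring an explanation with infidelity and sensitivity (Captum).
import torch
import torch.nn as nn
from captum.attr import Saliency
from captum.metrics import infidelity, sensitivity_max

model = nn.Sequential(nn.Flatten(), nn.Linear(64, 2))   # stand-in classifier
model.eval()
x = torch.randn(4, 1, 8, 8, requires_grad=True)         # stand-in images

saliency = Saliency(model)
attributions = saliency.attribute(x, target=0)

def perturb_fn(inputs):
    # Small Gaussian perturbations; returns (noise, perturbed inputs).
    noise = torch.randn_like(inputs) * 0.03
    return noise, inputs - noise

# Lower infidelity / lower sensitivity = more faithful, more stable explanation.
infid = infidelity(model, perturb_fn, x, attributions, target=0)
sens = sensitivity_max(saliency.attribute, x, target=0)
print(infid.mean().item(), sens.mean().item())
```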
Omni-directional mobile robot (OMR) systems have been very popular in academia and industry for their superb maneuverability and flexibility. Yet their potential has not been fully exploited: the extra degree of freedom in an OMR can enable the robot to carry out extra tasks. For instance, gimbals or sensors on a robot may suffer from a limited field of view or be constrained by the inherent mechanical design, requiring the chassis to be orientation-aware and respond in time. To address this problem and further develop OMR systems, in this paper we categorize the tasks related to the OMR chassis into orientation transition tasks and position transition tasks, where the two can be carried out at the same time. By integrating the parallel task goals in a single planning problem, we propose an orientation-aware planning architecture for OMR systems that executes orientation transitions and position transitions in a unified and efficient way. A modified trajectory optimization method called orientation-aware timed-elastic-band (OATEB) is introduced to generate trajectories that satisfy the requirements of both tasks. Experiments in both 2D simulated environments and real scenes are carried out. A four-wheeled OMR is deployed in the real-scene experiment, and the results demonstrate that the proposed method is capable of simultaneously executing parallel tasks and is applicable to real-life scenarios.
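The OATEB optimizer is not public in this sketch; the toy example below only illustrates the underlying property the abstract relies on: because the chassis is holonomic, headings along a fixed positional path can be optimized as a separate, parallel task (here, tracking the bearing to an observation target while penalizing angular velocity). All names and weights are our own illustrative choices.

```python
# Toy sketch: decoupled heading optimization along a fixed OMR path.
import numpy as np
from scipy.optimize import minimize

path = np.stack([np.linspace(0, 5, 20), np.zeros(20)], axis=1)  # fixed x-y path
target = np.array([2.5, 3.0])                                   # point to observe

d = target - path
desired = np.arctan2(d[:, 1], d[:, 0])       # bearing to target at each waypoint

def cost(theta, w_track=1.0, w_smooth=5.0):
    track = np.sum((np.unwrap(theta) - desired) ** 2)   # orientation task
    smooth = np.sum(np.diff(theta) ** 2)                # angular-velocity penalty
    return w_track * track + w_smooth * smooth

res = minimize(cost, np.zeros(len(path)), method="L-BFGS-B")
print("optimized headings (deg):", np.degrees(res.x).round(1))
```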
Real-time multi-agent collision-avoidance algorithms comprise a key enabling technology for the practical use of self-organising swarms of drones. This paper proposes a decentralised reciprocal collision-avoidance algorithm that is based on stigmergy and is scalable. The algorithm is computationally inexpensive, relying on the gradient of the locally measured dynamic cumulative signal strength field that results from the signals emitted by the swarm. The signal strength acts as a repulsor on each drone, which then tends to steer away from the noisiest regions (cluttered environment), thus avoiding collisions. The magnitudes of these repulsive forces can be tuned to control the relative importance assigned to collision avoidance with respect to the other phenomena affecting the agent's dynamics. We carried out numerical experiments on a self-organising swarm of drones aimed at fighting wildfires autonomously. As expected, we found that the collision rate can be reduced either by decreasing the cruise speed of the agents or by increasing the sampling frequency of the global signal strength field. A convenient by-product of the proposed collision-avoidance algorithm is that it helps maintain diversity in the swarm, thus enhancing exploration.
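A minimal sketch of the repulsion rule follows, under our own toy field model (an inverse-square decay, not necessarily the paper's): each drone measures the cumulative signal strength emitted by the rest of the swarm and steers down its local gradient, with `gain` playing the tunable-magnitude role described above.

```python
# Minimal sketch: gradient-based repulsion from a cumulative signal field.
import numpy as np

def field(p, positions, eps=1e-6):
    """Cumulative signal strength at point p from the emitting drones."""
    d2 = np.sum((positions - p) ** 2, axis=1) + eps
    return np.sum(1.0 / d2)                       # toy inverse-square decay

def repulsion(i, positions, h=1e-3, gain=0.05):
    """Finite-difference gradient of the field, excluding drone i's own signal."""
    others = np.delete(positions, i, axis=0)
    g = np.array([(field(positions[i] + h * e, others) -
                   field(positions[i] - h * e, others)) / (2 * h)
                  for e in np.eye(2)])
    return -gain * g   # gain tunes collision avoidance vs. mission dynamics

rng = np.random.default_rng(1)
pos = rng.uniform(0, 10, size=(12, 2))            # 12 drones in a 10x10 arena
for step in range(100):                           # loop rate ~ sampling frequency
    pos += np.array([repulsion(i, pos) for i in range(len(pos))])
```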