While training an end-to-end navigation network in the real world is usually of high cost, simulation provides a safe and cheap environment in this training stage. However, training neural network models in simulation brings up the problem of how to effectively transfer the model from simulation to the real world (sim-to-real). In this work, we regard the environment representation as a crucial element in this transfer process and propose a visual information pyramid (VIP) model to systematically investigate a practical environment representation. A novel representation composed of spatial and semantic information synthesis is then established accordingly, where noise model embedding is particularly considered. To explore the effectiveness of this representation, we compared the performance with representations popularly used in the literature in both simulated and real-world scenarios. Results suggest that our environment representation stands out. Furthermore, an analysis on the feature map is implemented to investigate the effectiveness through inner reaction, which could be irradiative for future researches on end-to-end navigation.
Reinforcement learning has attracted great attention recently, especially policy gradient algorithms, which have been demonstrated on challenging decision making and control tasks. In this paper, we propose an active multi-step TD algorithm with adaptive stepsizes to learn actor and critic. Specifically, our model consists of two components: active stepsize learning and adaptive multi-step TD algorithm. Firstly, we divide the time horizon into chunks and actively select state and action inside each chunk. Then given the selected samples, we propose the adaptive multi-step TD, which generalizes TD($\lambda$), but adaptively switch on/off the backups from future returns of different steps. Particularly, the adaptive multi-step TD introduces a context-aware mechanism, here a binary classifier, which decides whether or not to turn on its future backups based on the context changes. Thus, our model is kind of combination of active learning and multi-step TD algorithm, which has the capacity for learning off-policy without the need of importance sampling. We evaluate our approach on both discrete and continuous space tasks in an off-policy setting respectively, and demonstrate competitive results compared to other reinforcement learning baselines.
While training an end-to-end navigation network in the real world is usually of high cost, simulations provide a safe and cheap environment in this training stage. However, training neural network models in simulations brings up the problem of how to effectively transfer the model from simulations to the real world (sim-to-real). In this work, we regard the environment representation as a crucial element in this transfer process and propose a visual information pyramid (VIP) model to systematically investigate a practical environment representation. A novel representation composed of spatial and semantic information synthesis is then established accordingly. To explore the effectiveness of this representation, we compared the performance with representations popularly used in the literature in both simulated and real-world scenarios. Results suggest that our proposed environment representation behaves best. Furthermore, an analysis on the feature map is implemented to investigate the effectiveness through inner reaction, which could be irradiative for future researches on end-to-end navigation.
Digitized pathological diagnosis has been in increasing demand recently. It is well known that color information is critical to the automatic and visual analysis of pathological slides. However, the color variations due to various factors not only have negative impact on pathologist's diagnosis, but also will reduce the robustness of the algorithms. The factors that cause the color differences are not only in the process of making the slices, but also in the process of digitization. Different strategies have been proposed to alleviate the color variations. Most of such techniques rely on collecting color statistics to perform color matching across images and highly dependent on a reference template slide. Since the pathological slides between hospitals are usually unpaired, these methods do not yield good matching results. In this work, we propose a novel network that we refer to as Transitive Adversarial Networks (TAN) to transfer the color information among slides from different hospitals or centers. It is not necessary for an expert to pick a representative reference slide in the proposed TAN method. We compare the proposed method with the state-of-the-art methods quantitatively and qualitatively. Compared with the state-of-the-art methods, our method yields an improvement of 0.87dB in terms of PSNR, demonstrating the effectiveness of the proposed TAN method in stain style transfer.
Deep reinforcement learning (DRL) is a booming area of artificial intelligence. Many practical applications of DRL naturally involve more than one collaborative learners, making it important to study DRL in a multi-agent context. Previous research showed that effective learning in complex multi-agent systems demands for highly coordinated environment exploration among all the participating agents. Many researchers attempted to cope with this challenge through learning centralized value functions. However, the common strategy for every agent to learn their local policies directly often fail to nurture strong inter-agent collaboration and can be sample inefficient whenever agents alter their communication channels. To address these issues, we propose a new framework known as centralized training and exploration with decentralized execution via policy distillation. Guided by this framework and the maximum-entropy learning technique, we will first train agents' policies with shared global component to foster coordinated and effective learning. Locally executable policies will be derived subsequently from the trained global policies via policy distillation. Experiments show that our new framework and algorithm can achieve significantly better performance and higher sample efficiency than a cutting-edge baseline on several multi-agent DRL benchmarks.
Deep learning has recently become very popular on account of its incredible success in many complex data-driven applications, such as image classification and speech recognition. The database community has worked on data-driven applications for many years, and therefore should be playing a lead role in supporting this new wave. However, databases and deep learning are different in terms of both techniques and applications. In this paper, we discuss research problems at the intersection of the two fields. In particular, we discuss possible improvements for deep learning systems from a database perspective, and analyze database applications that may benefit from deep learning techniques.
Comprehensive quality-aware automated semantic web service composition is an NP-hard problem, where service composition workflows are unknown, and comprehensive quality, i.e., Quality of services (QoS) and Quality of semantic matchmaking (QoSM) are simultaneously optimized. The objective of this problem is to find a solution with optimized or near-optimized overall QoS and QoSM within polynomial time over a service request. In this paper, we proposed novel memetic EDA-based approaches to tackle this problem. The proposed method investigates the effectiveness of several neighborhood structures of composite services by proposing domain-dependent local search operators. Apart from that, a joint strategy of the local search procedure is proposed to integrate with a modified EDA to reduce the overall computation time of our memetic approach. To better demonstrate the effectiveness and scalability of our approach, we create a more challenging, augmented version of the service composition benchmark based on WSC-08 \cite{bansal2008wsc} and WSC-09 \cite{kona2009wsc}. Experimental results on this benchmark show that one of our proposed memetic EDA-based approach (i.e., MEEDA-LOP) significantly outperforms existing state-of-the-art algorithms.
Machine-learning-based data-driven applications have become ubiquitous, e.g., health-care analysis and database system optimization. Big training data and large (deep) models are crucial for good performance. Dropout has been widely used as an efficient regularization technique to prevent large models from overfitting. However, many recent works show that dropout does not bring much performance improvement for deep convolutional neural networks (CNNs), a popular deep learning model for data-driven applications. In this paper, we formulate existing dropout methods for CNNs under the same analysis framework to investigate the failures. We attribute the failure to the conflicts between the dropout and the batch normalization operation after it. Consequently, we propose to change the order of the operations, which results in new building blocks of CNNs.Extensive experiments on benchmark datasets CIFAR, SVHN and ImageNet have been conducted to compare the existing building blocks and our new building blocks with different dropout methods. The results confirm the superiority of our proposed building blocks due to the regularization and implicit model ensemble effect of dropout. In particular, we improve over state-of-the-art CNNs with significantly better performance of 3.17%, 16.15%, 1.44%, 21.46% error rate on CIFAR-10, CIFAR-100, SVHN and ImageNet respectively.