Recent trends focusing on Industry 4.0 concept and smart manufacturing arise a data-driven fault diagnosis as key topic in condition-based maintenance. Fault diagnosis is considered as an essential task in rotary machinery since possibility of an early detection and diagnosis of the faulty condition can save both time and money. Traditional data-driven techniques of fault diagnosis require signal processing for feature extraction, as they are unable to work with raw signal data, consequently leading to need for expert knowledge and human work. The emergence of deep learning architectures in condition-based maintenance promises to ensure high performance fault diagnosis while lowering necessity for expert knowledge and human work. This paper presents developed technique for deep learning-based data-driven fault diagnosis of rotary machinery. The proposed technique input raw three axis accelerometer signal as high-definition image into deep learning layers which automatically extract signal features, enabling high classification accuracy.
How to obtain a model with good interpretability and performance has always been an important research topic. In this paper, we propose rectified decision trees (ReDT), a knowledge distillation based decision trees rectification with high interpretability, small model size, and empirical soundness. Specifically, we extend the impurity calculation and the pure ending condition of the classical decision tree to propose a decision tree extension that allows the use of soft labels generated by a well-trained teacher model in training and prediction process. It is worth noting that for the acquisition of soft labels, we propose a new multiple cross-validation based method to reduce the effects of randomness and overfitting. These approaches ensure that ReDT retains excellent interpretability and even achieves fewer nodes than the decision tree in the aspect of compression while having relatively good performance. Besides, in contrast to traditional knowledge distillation, back propagation of the student model is not necessarily required in ReDT, which is an attempt of a new knowledge distillation approach. Extensive experiments are conducted, which demonstrates the superiority of ReDT in interpretability, compression, and empirical soundness.
Recent advancements in the perception for autonomous driving are driven by deep learning. In order to achieve the robust and accurate scene understanding, autonomous vehicles are usually equipped with different sensors (e.g. cameras, LiDARs, Radars), and multiple sensing modalities can be fused to exploit their complementary properties. In this context, many methods have been proposed for deep multi-modal perception problems. However, there is no general guideline for network architecture design, and questions of "what to fuse", "when to fuse", and "how to fuse" remain open. This review paper attempts to systematically summarize methodologies and discuss challenges for deep multi-modal object detection and semantic segmentation in autonomous driving. To this end, we first provide an overview of on-board sensors on test vehicles, open datasets and the background information of object detection and semantic segmentation for the autonomous driving research. We then summarize the fusion methodologies and discuss challenges and open questions. In the appendix, we provide tables that summarize topics and methods. We also provide an interactive online platform to navigate each reference: https://multimodalperception.github.io.
The past few decades have witnessed the great progress of unmanned aircraft vehicles (UAVs) in civilian fields, especially in photogrammetry and remote sensing. In contrast with the platforms of manned aircraft and satellite, the UAV platform holds many promising characteristics: flexibility, efficiency, high-spatial/temporal resolution, low cost, easy operation, etc., which make it an effective complement to other remote-sensing platforms and a cost-effective means for remote sensing. Considering the popularity and expansion of UAV-based remote sensing in recent years, this paper provides a systematic survey on the recent advances and future prospectives of UAVs in the remote-sensing community. Specifically, the main challenges and key technologies of remote-sensing data processing based on UAVs are discussed and summarized firstly. Then, we provide an overview of the widespread applications of UAVs in remote sensing. Finally, some prospects for future work are discussed. We hope this paper will provide remote-sensing researchers an overall picture of recent UAV-based remote sensing developments and help guide the further research on this topic.
Scene graph generation refers to the task of automatically mapping an image into a semantic structural graph, which requires correctly labeling each extracted objects and their interaction relationships. Despite the recent successes in object detection using deep learning techniques, inferring complex contextual relationships and structured graph representations from visual data remains a challenging topic. In this study, we propose a novel Attentive Relational Network that consists of two key modules with an object detection backbone to approach this problem. The first module is a semantic transformation module used to capture semantic embedded relation features, by translating visual features and linguistic features into a common semantic space. The other module is a graph self-attention module introduced to embed a joint graph representation through assigning various importance weights to neighboring nodes. Finally, accurate scene graphs are produced with the relation inference module by recognizing all entities and the corresponding relations. We evaluate our proposed method on the widely-adopted Visual Genome Dataset, and the results demonstrate the effectiveness and superiority of our model.
Current conversational systems can follow simple commands and answer basic questions, but they have difficulty maintaining coherent and open-ended conversations about specific topics. Competitions like the Conversational Intelligence (ConvAI) challenge are being organized to push the research development towards that goal. This article presents in detail the RLLChatbot that participated in the 2017 ConvAI challenge. The goal of this research is to better understand how current deep learning and reinforcement learning tools can be used to build a robust yet flexible open domain conversational agent. We provide a thorough description of how a dialog system can be built and trained from mostly public-domain datasets using an ensemble model. The first contribution of this work is a detailed description and analysis of different text generation models in addition to novel message ranking and selection methods. Moreover, a new open-source conversational dataset is presented. Training on this data significantly improves the [email protected] score of the ranking and selection mechanisms compared to our baseline model responsible for selecting the message returned at each interaction.
Automatic abstractive text summarization is an important and challenging research topic of natural language processing. Among many widely used languages, the Chinese language has a special property that a Chinese character contains rich information comparable to a word. Existing Chinese text summarization methods, either adopt totally character-based or word-based representations, fail to fully exploit the information carried by both representations. To accurately capture the essence of articles, we propose a hybrid word-character approach (HWC) which preserves the advantages of both word-based and character-based representations. We evaluate the advantage of the proposed HWC approach by applying it to two existing methods, and discover that it generates state-of-the-art performance with a margin of 24 ROUGE points on a widely used dataset LCSTS. In addition, we find an issue contained in the LCSTS dataset and offer a script to remove overlapping pairs (a summary and a short text) to create a clean dataset for the community. The proposed HWC approach also generates the best performance on the new, clean LCSTS dataset.
In online question-and-answer (QA) websites like Quora, one central issue is to find (invite) users who are able to provide answers to a given question and at the same time would be unlikely to say "no" to the invitation. The challenge is how to trade off the matching degree between users' expertise and the question topic, and the likelihood of positive response from the invited users. In this paper, we formally formulate the problem and develop a weakly supervised factor graph (WeakFG) model to address the problem. The model explicitly captures expertise matching degree between questions and users. To model the likelihood that an invited user is willing to answer a specific question, we incorporate a set of correlations based on social identity theory into the WeakFG model. We use two different genres of datasets: QA-Expert and Paper-Reviewer, to validate the proposed model. Our experimental results show that the proposed model can significantly outperform (+1.5-10.7% by MAP) the state-of-the-art algorithms for matching users (experts) with community questions. We have also developed an online system to further demonstrate the advantages of the proposed method.
Artificial neural networks have recently shown great results in many disciplines and a variety of applications, including natural language understanding, speech processing, games and image data generation. One particular application in which the strong performance of artificial neural networks was demonstrated is the recognition of objects in images, where deep convolutional neural networks are commonly applied. In this survey, we give a comprehensive introduction to this topic (object recognition with deep convolutional neural networks), with a strong focus on the evolution of network architectures. Therefore, we aim to compress the most important concepts in this field in a simple and non-technical manner to allow for future researchers to have a quick general understanding. This work is structured as follows: 1. We will explain the basic ideas of (convolutional) neural networks and deep learning and examine their usage for three object recognition tasks: image classification, object localization and object detection. 2. We give a review on the evolution of deep convolutional neural networks by providing an extensive overview of the most important network architectures presented in chronological order of their appearances.
The stochastic variational inference (SVI) paradigm, which combines variational inference, natural gradients, and stochastic updates, was recently proposed for large-scale data analysis in conjugate Bayesian models and demonstrated to be effective in several problems. This paper studies a family of Bayesian latent variable models with two levels of hidden variables but without any conjugacy requirements, making several contributions in this context. The first is observing that SVI, with an improved structured variational approximation, is applicable under more general conditions than previously thought with the only requirement being that the approximating variational distribution be in the same family as the prior. The resulting approach, Monte Carlo Structured SVI (MC-SSVI), significantly extends the scope of SVI, enabling large-scale learning in non-conjugate models. For models with latent Gaussian variables we propose a hybrid algorithm, using both standard and natural gradients, which is shown to improve stability and convergence. Applications in mixed effects models, sparse Gaussian processes, probabilistic matrix factorization and correlated topic models demonstrate the generality of the approach and the advantages of the proposed algorithms.