End-to-end speech translation, a hot topic in recent years, aims to translate a segment of audio into a specific language with an end-to-end model. Conventional approaches employ multi-task learning and pre-training methods for this task, but they suffer from the huge gap between pre-training and fine-tuning. To address this issue, we propose a Tandem Connectionist Encoding Network (TCEN), which bridges the gap by reusing all subnets in fine-tuning, keeping the roles of subnets consistent, and pre-training the attention module. Furthermore, we propose two simple but effective methods to guarantee that the speech encoder outputs and the MT encoder inputs are consistent in terms of semantic representation and sequence length. Experimental results show that our model outperforms baselines by 2.2 BLEU on a large benchmark dataset.
Linear mixture models have proven very useful in a plethora of applications, e.g., topic modeling, clustering, and source separation. As a critical aspect of linear mixture models, identifiability of the model parameters is well studied under frameworks such as independent component analysis and constrained matrix factorization. Nevertheless, when the linear mixtures are distorted by unknown nonlinear functions -- which is well motivated and more realistic in many cases -- the identifiability issues are much less studied. This work proposes an identification criterion for a nonlinear mixture model that is well grounded in many real-world applications, and offers identifiability guarantees. A practical implementation based on a judiciously designed neural network is proposed to realize the criterion, and an effective learning algorithm is proposed. Numerical results on synthetic and real data corroborate the effectiveness of the proposed method.
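The setting described above, a linear mixture distorted by unknown nonlinear functions, can be illustrated with a minimal sketch. It assumes that each observed channel is distorted by its own element-wise, invertible function; the abstract does not state this, and the particular functions, matrix sizes, and names (`mix`, `nonlinear_mix`, `distortions`) are hypothetical choices for illustration only, not the paper's model or algorithm.

```python
import math
import random


def mix(A, s):
    """Linear mixture: matrix-vector product A s for a list-of-lists matrix A."""
    return [sum(a_ij * s_j for a_ij, s_j in zip(row, s)) for row in A]


# Hypothetical element-wise distortions, one per observed channel.
# All three are invertible; that is an assumption of this sketch.
distortions = [
    math.tanh,
    lambda z: z ** 3,
    lambda z: math.log1p(math.exp(z)),  # softplus
]


def nonlinear_mix(A, s):
    """Apply a channel-specific distortion to each entry of the linear mixture."""
    return [g(z) for g, z in zip(distortions, mix(A, s))]


rng = random.Random(0)
A = [[rng.uniform(0.1, 1.0) for _ in range(2)] for _ in range(3)]  # 3 channels, 2 sources
s = [rng.uniform(0.0, 1.0) for _ in range(2)]                      # latent sources

linear = mix(A, s)              # what a linear mixture model would observe
observed = nonlinear_mix(A, s)  # what is observed under the distorted model
```

If the distortions were known, inverting them channel by channel would recover the linear mixture; the difficulty the abstract points to is that they are unknown.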
The topic of aspect-based sentiment analysis (ABSA) has been explored for a variety of industries, but it remains largely unexplored in finance. The recent release of data for an open challenge (FiQA) from the companion proceedings of WWW '18 has provided valuable finance-specific annotations. FiQA contains high-quality labels, but it still lacks the data quantity needed to apply traditional ABSA deep learning architectures. In this paper, we employ high-level semantic representations and methods of inductive transfer learning for NLP. We experiment with extensions of recently developed domain adaptation methods and target task fine-tuning that significantly improve performance on a small dataset. Our results show an 8.7% improvement in the F1 score for classification and an 11% improvement in the MSE for regression over current state-of-the-art results.
Most generative document models act on bag-of-words input in an attempt to focus on the semantic content and thereby partially forego syntactic information. We argue that it is preferable to keep the original word order intact and explicitly account for the syntactic structure instead. We propose an extension to the Neural Variational Document Model (Miao et al., 2016) that does exactly that, separating local (syntactic) context from the global (semantic) representation of the document. Our model builds on the variational autoencoder framework to define a generative document model based on next-word prediction. We name our approach Sequence-Aware Variational Autoencoder since, in contrast to its predecessor, it operates on the true input sequence. In a series of experiments, we observe stronger topicality of the learned representations as well as increased robustness to syntactic noise in our training data.
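The loss of word order under a bag-of-words representation can be seen in a small, hypothetical example. The two sentences and the helper `bag_of_words` below are illustrative assumptions, not taken from the paper; the sketch only shows that two documents with different meanings can share the same bag.

```python
from collections import Counter


def bag_of_words(text):
    """Unordered word counts: a representation that discards word order."""
    return Counter(text.lower().split())


doc_a = "the dog bit the man"
doc_b = "the man bit the dog"

# Same multiset of words, so a bag-of-words model cannot tell the two apart.
same_bag = bag_of_words(doc_a) == bag_of_words(doc_b)

# The token sequences differ, which a sequence-aware model can exploit.
same_sequence = doc_a.split() == doc_b.split()
```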
This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them.
Reinforcement learning (RL), while often powerful, can suffer from slow learning speeds, particularly in high-dimensional spaces. The autonomous decomposition of tasks and the use of hierarchical methods hold the potential to significantly speed up learning in such domains. This paper proposes a novel practical method that can autonomously decompose tasks by leveraging association rule mining, a data mining technique that discovers hidden relationships among entities. We introduce a novel method called ARM-HSTRL (Association Rule Mining to extract Hierarchical Structure of Tasks in Reinforcement Learning). It extracts temporal and structural relationships of sub-goals in RL and multi-task RL. In particular, it finds sub-goals and the relationships among them. The significant efficiency and performance of the proposed method are shown in two main topics of RL.
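As a rough illustration of the association rule mining idea mentioned above, the following sketch mines pairwise rules from sets of visited states using the standard support and confidence measures. It is not ARM-HSTRL: it ignores temporal order, and the function name `mine_rules`, the thresholds, and the toy episodes are all hypothetical.

```python
from itertools import combinations


def mine_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    support(X)      = fraction of transactions that contain every item in X
    confidence(a→b) = support({a, b}) / support({a})
    """
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted(set().union(*sets))

    def support(itemset):
        return sum(itemset <= s for s in sets) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue  # the pair is too rare to yield a reliable rule
        for ante, cons in ((a, b), (b, a)):
            confidence = pair_support / support({ante})
            if confidence >= min_confidence:
                rules.append((ante, cons, pair_support, confidence))
    return rules


# Hypothetical toy data: states visited in four episodes.
episodes = [
    ["key", "door", "goal"],
    ["key", "door", "goal"],
    ["key", "corridor", "door", "goal"],
    ["corridor", "key"],
]

rules = mine_rules(episodes, min_support=0.5, min_confidence=0.9)
rule_pairs = {(ante, cons) for ante, cons, _, _ in rules}
```

In this toy data, every episode that visits `door` also visits `key`, so the rule `door → key` has confidence 1.0, whereas `key → door` has confidence 0.75 and is filtered out. Rules of this kind hint at which states might act as prerequisites for others, which is the sort of relationship a sub-goal hierarchy could be built from.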
The practice of evidence-based medicine (EBM) urges medical practitioners to utilise the latest research evidence when making clinical decisions. Because of the massive and growing volume of published research on various medical topics, practitioners often find themselves overloaded with information. As such, natural language processing research has recently begun exploring techniques for medical domain-specific automated text summarisation (ATS), targeted towards the task of condensing large medical texts. However, the development of effective summarisation techniques for this task requires cross-domain knowledge. We present a survey of EBM, the domain-specific needs for EBM, automated summarisation techniques, and how they have been applied hitherto. We envision that this survey will serve as a first resource for the development of future operational text summarisation techniques for EBM.
The paradigm shift from shallow classifiers with hand-crafted features to end-to-end trainable deep learning models has shown significant improvements on supervised learning tasks. Despite the promising power of deep neural networks (DNNs), how to alleviate overfitting during training remains a research topic of interest. In this paper, we present a Generative-Discriminative Variational Model (GDVM) for visual classification, in which we introduce a latent variable inferred from inputs to provide generative abilities for prediction. In other words, our GDVM casts the supervised learning task as a generative learning process, with data discrimination jointly exploited for improved classification. In our experiments, we consider the tasks of multi-class classification, multi-label classification, and zero-shot learning. We show that our GDVM performs favorably against the baselines and recent generative DNN models.
As an emerging research topic, online class imbalance learning often combines the challenges of both class imbalance and concept drift. It deals with data streams having very skewed class distributions, where concept drift may occur. It has recently received increased research attention; however, very little work addresses the combined problem where both class imbalance and concept drift coexist. As the first systematic study of handling concept drift in class-imbalanced data streams, this paper first provides a comprehensive review of current research progress in this field, including current research focuses and open challenges. Then, an in-depth experimental study is performed, with the goal of understanding how to best overcome concept drift in online learning with class imbalance. Based on the analysis, a general guideline is proposed for the development of an effective algorithm.