Lili Mou

Why Do Neural Dialog Systems Generate Short and Meaningless Replies? A Comparison between Dialog and Translation

Dec 06, 2017
Bolin Wei, Shuai Lu, Lili Mou, Hao Zhou, Pascal Poupart, Ge Li, Zhi Jin

This paper addresses the question: Why do neural dialog systems generate short and meaningless replies? We conjecture that, in a dialog system, an utterance may have multiple equally plausible replies, which causes neural networks to perform poorly in the dialog application. We propose a systematic way to mimic the dialog scenario in a machine translation system, and manage to reproduce the phenomenon of generating short and less meaningful sentences in the translation setting, providing evidence for our conjecture.

Affective Neural Response Generation

Sep 12, 2017
Nabiha Asghar, Pascal Poupart, Jesse Hoey, Xin Jiang, Lili Mou

Existing neural conversational models process natural language primarily on a lexico-syntactic level, thereby ignoring one of the most crucial components of human-to-human dialogue: its affective content. We take a step in this direction by proposing three novel ways to incorporate affective/emotional aspects into long short-term memory (LSTM) encoder-decoder neural conversation models: (1) affective word embeddings, which are cognitively engineered, (2) affect-based objective functions that augment the standard cross-entropy loss, and (3) affectively diverse beam search for decoding. Experiments show that these techniques improve the open-domain conversational prowess of encoder-decoder networks by enabling them to produce emotionally rich responses that are more interesting and natural.

* 8 pages

RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems

Jul 16, 2017
Chongyang Tao, Lili Mou, Dongyan Zhao, Rui Yan

Open-domain human-computer conversation has been attracting increasing attention over the past few years. However, there does not exist a standard automatic evaluation metric for open-domain dialog systems; researchers usually resort to human annotation for model evaluation, which is time- and labor-intensive. In this paper, we propose RUBER, a Referenced metric and Unreferenced metric Blended Evaluation Routine, which evaluates a reply by taking into consideration both a groundtruth reply and a query (previous user-issued utterance). Our metric is learnable, but its training does not require labels of human satisfaction. Hence, RUBER is flexible and extensible to different datasets and languages. Experiments on both retrieval and generative dialog systems show that RUBER has a high correlation with human annotation.
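
To make the idea of blending a referenced and an unreferenced score concrete, here is a minimal sketch. It is not the authors' implementation: the bag-of-words cosine used for the referenced score, the `unreferenced_scorer` callable, and the `min`/`max`/`mean` blending strategies are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def referenced_score(reply, groundtruth):
    """Cosine similarity of bag-of-words counts.

    Assumption: a stand-in for whatever similarity the real metric uses
    between the generated reply and the groundtruth reply.
    """
    a, b = Counter(reply.split()), Counter(groundtruth.split())
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0

def blend(referenced, unreferenced, strategy="mean"):
    """Combine the two scores; the strategy names are hypothetical."""
    strategies = {"min": min, "max": max, "mean": lambda x, y: (x + y) / 2}
    return strategies[strategy](referenced, unreferenced)

def blended_score(query, reply, groundtruth, unreferenced_scorer, strategy="mean"):
    """Score a reply against both the groundtruth reply and the query.

    `unreferenced_scorer(query, reply)` is a hypothetical callable that stands
    in for a learned model rating how well the reply fits the query.
    """
    return blend(
        referenced_score(reply, groundtruth),
        unreferenced_scorer(query, reply),
        strategy,
    )
```

In this sketch the unreferenced scorer is passed in as a function, so a trained model could replace the placeholder without touching the blending logic.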

Coupling Distributed and Symbolic Execution for Natural Language Queries

Jun 16, 2017
Lili Mou, Zhengdong Lu, Hang Li, Zhi Jin

Building neural networks to query a knowledge base (a table) with natural language is an emerging research topic in deep learning. An executor for table querying typically requires multiple steps of execution because queries may have complicated structures. In previous studies, researchers have developed either fully distributed executors or symbolic executors for table querying. A distributed executor can be trained in an end-to-end fashion, but is weak in terms of execution efficiency and explicit interpretability. A symbolic executor is efficient in execution, but is very difficult to train, especially at initial stages. In this paper, we propose to couple distributed and symbolic execution for natural language queries, where the symbolic executor is pretrained with the distributed executor's intermediate execution results in a step-by-step fashion. Experiments show that our approach significantly outperforms both distributed and symbolic executors, exhibiting high accuracy, high learning efficiency, high execution efficiency, and high interpretability.

* Accepted by ICML-17; also presented at ICLR-17 Workshop

How Transferable are Neural Networks in NLP Applications?

Oct 13, 2016
Lili Mou, Zhao Meng, Rui Yan, Ge Li, Yan Xu, Lu Zhang, Zhi Jin

Transfer learning aims to make use of valuable knowledge in a source domain to help model performance in a target domain. It is particularly important to neural networks, which are very likely to overfit. In some fields like image processing, many studies have shown the effectiveness of neural network-based transfer learning. For neural NLP, however, existing studies have only casually applied transfer learning, and conclusions are inconsistent. In this paper, we conduct systematic case studies and provide an illuminating picture of the transferability of neural networks in NLP.

* Accepted by EMNLP-16

Sequence to Backward and Forward Sequences: A Content-Introducing Approach to Generative Short-Text Conversation

Oct 13, 2016
Lili Mou, Yiping Song, Rui Yan, Ge Li, Lu Zhang, Zhi Jin

Using neural networks to generate replies in human-computer dialogue systems has been attracting increasing attention over the past few years. However, the performance is not satisfactory: the neural network tends to generate safe, universally relevant replies which carry little meaning. In this paper, we propose a content-introducing approach to neural network-based generative dialogue systems. We first use pointwise mutual information (PMI) to predict a noun as a keyword, reflecting the main gist of the reply. We then propose seq2BF, a "sequence to backward and forward sequences" model, which generates a reply containing the given keyword. Experimental results show that our approach significantly outperforms traditional sequence-to-sequence models in terms of human evaluation and the entropy measure, and that the predicted keyword can appear at an appropriate position in the reply.

* Accepted by COLING
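
The PMI-based keyword step can be sketched roughly as follows. This is an illustrative approximation, not the paper's exact formulation: the toy corpus, the function names `train_pmi` and `predict_keyword`, the treatment of unseen pairs, and the choice to sum PMI over query words are assumptions, and the candidate list is assumed to contain only nouns (part-of-speech tagging is omitted).

```python
import math
from collections import Counter

def train_pmi(pairs):
    """Estimate PMI between query words and reply words from (query, reply) pairs."""
    n = len(pairs)
    q_count, r_count, joint = Counter(), Counter(), Counter()
    for query, reply in pairs:
        q_words, r_words = set(query.split()), set(reply.split())
        q_count.update(q_words)
        r_count.update(r_words)
        joint.update((q, r) for q in q_words for r in r_words)

    def pmi(q, r):
        if joint[(q, r)] == 0:
            return 0.0  # assumption: unseen pairs contribute nothing
        p_joint = joint[(q, r)] / n
        return math.log(p_joint / ((q_count[q] / n) * (r_count[r] / n)))

    return pmi

def predict_keyword(query, candidates, pmi):
    """Pick the candidate with the highest summed PMI against the query words."""
    q_words = set(query.split())
    return max(candidates, key=lambda r: sum(pmi(q, r) for q in q_words))

# Hypothetical toy corpus of (query, reply) pairs.
pairs = [
    ("do you like coffee", "i prefer tea"),
    ("coffee or juice", "tea please"),
    ("how is the weather", "rain again"),
    ("weather today", "rain and wind"),
]
pmi = train_pmi(pairs)
keyword = predict_keyword("i want coffee", ["tea", "rain"], pmi)
```

The predicted keyword would then be handed to a generator that is required to include it in the reply; that generation step is outside the scope of this sketch.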

Improved Relation Classification by Deep Recurrent Neural Networks with Data Augmentation

Oct 13, 2016
Yan Xu, Ran Jia, Lili Mou, Ge Li, Yunchuan Chen, Yangyang Lu, Zhi Jin

Nowadays, neural networks play an important role in the task of relation classification. By designing different neural architectures, researchers have improved the performance to a large extent in comparison with traditional methods. However, existing neural networks for relation classification are usually of shallow architectures (e.g., one-layer convolutional neural networks or recurrent networks). They may fail to explore the potential representation space in different abstraction levels. In this paper, we propose deep recurrent neural networks (DRNNs) for relation classification to tackle this challenge. Further, we propose a data augmentation method by leveraging the directionality of relations. We evaluate our DRNNs on SemEval-2010 Task 8 and achieve an F1-score of 86.1%, outperforming previously reported state-of-the-art results.

* Accepted by COLING-16
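
One way to read "leveraging the directionality of relations" is that a directed sample can be reversed to yield a second training sample with the opposite direction label. The sketch below illustrates that reading only; representing a sample as a token path plus a label, and the helper names `flip_direction` and `augment`, are assumptions rather than details taken from the paper.

```python
def flip_direction(label):
    """Swap the (e1,e2)/(e2,e1) suffix; undirected labels such as 'Other' are unchanged."""
    if label.endswith("(e1,e2)"):
        return label[: -len("(e1,e2)")] + "(e2,e1)"
    if label.endswith("(e2,e1)"):
        return label[: -len("(e2,e1)")] + "(e1,e2)"
    return label

def augment(samples):
    """Return the original samples plus one reversed copy of each directed sample."""
    augmented = list(samples)
    for path, label in samples:
        flipped = flip_direction(label)
        if flipped != label:
            # Reversing the path swaps which entity comes first,
            # so the direction in the label flips with it.
            augmented.append((path[::-1], flipped))
    return augmented

# Hypothetical samples: (token path between the two entities, relation label).
samples = [
    (["burst", "caused", "pressure"], "Cause-Effect(e2,e1)"),
    (["table", "near", "window"], "Other"),
]
augmented = augment(samples)
```

Only directed relations gain a reversed copy here, so the undirected `Other` sample is left as a single instance.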

Dialogue Session Segmentation by Embedding-Enhanced TextTiling

Oct 13, 2016
Yiping Song, Lili Mou, Rui Yan, Li Yi, Zinan Zhu, Xiaohua Hu, Ming Zhang

In human-computer conversation systems, the context of a user-issued utterance is particularly important because it provides useful background information about the conversation. However, it is unwise to track all previous utterances in the current session, as not all of them are equally important. In this paper, we address the problem of session segmentation. We propose an embedding-enhanced TextTiling approach, inspired by the observation that conversation utterances are highly noisy, and that word embeddings provide a robust way of capturing semantics. Experimental results show that our approach achieves better performance than the TextTiling and MMD approaches.

* INTERSPEECH-16, pages 2706-2710
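
A rough sketch of how word embeddings could replace word-count similarity in a TextTiling-style segmenter is given below. It is a simplification under stated assumptions: utterance vectors are plain averages of word embeddings, the depth score compares each gap against the highest similarity on either side rather than the nearest peak, and the toy embeddings and `threshold` value are hypothetical.

```python
import math

def utterance_vector(utterance, embeddings):
    """Average the embeddings of known words (a simple pooling assumption)."""
    vectors = [embeddings[w] for w in utterance.split() if w in embeddings]
    dim = len(next(iter(embeddings.values())))
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def segment(utterances, embeddings, threshold=0.5):
    """Return indices i such that a new session starts at utterances[i]."""
    vecs = [utterance_vector(u, embeddings) for u in utterances]
    sims = [cosine(vecs[i], vecs[i + 1]) for i in range(len(vecs) - 1)]
    boundaries = []
    for i, s in enumerate(sims):
        # Depth score: how far this gap's similarity dips below the
        # highest similarity on its left and on its right.
        depth = (max(sims[: i + 1]) - s) + (max(sims[i:]) - s)
        if depth > threshold:
            boundaries.append(i + 1)
    return boundaries

# Hypothetical 2-dimensional embeddings for illustration only.
embeddings = {
    "coffee": [1.0, 0.0],
    "tea": [0.9, 0.1],
    "rain": [0.0, 1.0],
    "wind": [0.1, 0.9],
}
utterances = ["i like coffee", "tea is nice too", "it may rain", "wind is strong"]
boundaries = segment(utterances, embeddings)
```

With these toy vectors the similarity drops sharply between the second and third utterances, which is where the sketch places the session boundary.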

Compressing Neural Language Models by Sparse Word Representations

Oct 13, 2016
Yunchuan Chen, Lili Mou, Yan Xu, Ge Li, Zhi Jin

Neural networks are among the state-of-the-art techniques for language modeling. Existing neural language models typically map discrete words to distributed, dense vector representations. After information processing of the preceding context words by hidden layers, an output layer estimates the probability of the next word. Such approaches are time- and memory-intensive because of the large numbers of parameters for word embeddings and the output layer. In this paper, we propose to compress neural language models by sparse word representations. In the experiments, the number of parameters in our model increases very slowly, almost imperceptibly, with the growth of the vocabulary size. Moreover, our approach not only reduces the parameter space to a large extent, but also improves the performance in terms of the perplexity measure.

* ACL-16, pages 226-235
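
The slow parameter growth can be illustrated with back-of-the-envelope arithmetic. The scheme assumed here is a guess at one plausible setup, not a description of the paper's model: a fixed set of `base_size` common words keeps dense `dim`-dimensional embeddings, and every other word is stored as at most `nonzeros` sparse coefficients. All numbers below are hypothetical.

```python
def dense_params(vocab_size, dim):
    """Embedding parameters when every word has its own dense vector."""
    return vocab_size * dim

def sparse_params(vocab_size, dim, base_size, nonzeros):
    """Embedding parameters under the assumed sparse scheme.

    Common words keep dense vectors; each remaining word costs only
    `nonzeros` coefficients.
    """
    rare_words = max(vocab_size - base_size, 0)
    return min(vocab_size, base_size) * dim + rare_words * nonzeros

# Hypothetical sizes: 200-dimensional vectors, 8,000 common words,
# 10 nonzero coefficients per rare word.
for vocab_size in (10_000, 50_000, 100_000):
    dense = dense_params(vocab_size, 200)
    sparse = sparse_params(vocab_size, 200, 8_000, 10)
    print(vocab_size, dense, sparse)
```

Under these assumptions each extra vocabulary word adds 10 parameters instead of 200, which is the kind of slow growth the abstract describes.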

Distilling Word Embeddings: An Encoding Approach

Jul 24, 2016
Lili Mou, Ran Jia, Yan Xu, Ge Li, Lu Zhang, Zhi Jin

Distilling knowledge from a well-trained cumbersome network to a small one has recently become a new research topic, as lightweight neural networks with high performance are particularly needed in various resource-restricted systems. This paper addresses the problem of distilling word embeddings for NLP tasks. We propose an encoding approach to distill task-specific knowledge from a set of high-dimensional embeddings, which can reduce model complexity by a large margin as well as retain high accuracy, showing a good compromise between efficiency and performance. Experiments in two tasks show that distilling knowledge from cumbersome embeddings is better than directly training neural networks with small embeddings.

* Accepted by CIKM-16 as a short paper, and by the Representation Learning for Natural Language Processing (RL4NLP) Workshop @ACL-16 for presentation
