Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zhaojiang Lin

XPersona: Evaluating Multilingual Personalized Chatbot

Mar 17, 2020
Zhaojiang Lin, Zihan Liu, Genta Indra Winata, Samuel Cahyawijaya, Andrea Madotto, Yejin Bang, Etsuko Ishii, Pascale Fung

Figure 1 for XPersona: Evaluating Multilingual Personalized Chatbot

Figure 2 for XPersona: Evaluating Multilingual Personalized Chatbot

Figure 3 for XPersona: Evaluating Multilingual Personalized Chatbot

Figure 4 for XPersona: Evaluating Multilingual Personalized Chatbot

Personalized dialogue systems are an essential step toward better human-machine interaction. Existing personalized dialogue agents rely on properly designed conversational datasets, which are mostly monolingual (e.g., English), which greatly limits the usage of conversational agents in other languages. In this paper, we propose a multi-lingual extension of Persona-Chat, namely XPersona. Our dataset includes persona conversations in six different languages other than English for building and evaluating multilingual personalized agents. We experiment with both multilingual and cross-lingual trained baselines, and evaluate them against monolingual and translation-pipeline models using both automatic and human evaluation. Experimental results show that the multilingual trained models outperform the translation-pipeline and that they are on par with the monolingual models, with the advantage of having a single model across multiple languages. On the other hand, the state-of-the-art cross-lingual trained models achieve inferior performance to the other models, showing that cross-lingual conversation modeling is a challenging task. We hope that our dataset and baselines~\footnote{Datasets and all the baselines will be released} will accelerate research in multilingual dialogue systems.

* Preprint, 23 pages

Via

Access Paper or Ask Questions

Attention over Parameters for Dialogue Systems

Jan 07, 2020
Andrea Madotto, Zhaojiang Lin, Chien-Sheng Wu, Jamin Shin, Pascale Fung

Figure 1 for Attention over Parameters for Dialogue Systems

Figure 2 for Attention over Parameters for Dialogue Systems

Figure 3 for Attention over Parameters for Dialogue Systems

Figure 4 for Attention over Parameters for Dialogue Systems

Dialogue systems require a great deal of different but complementary expertise to assist, inform, and entertain humans. For example, different domains (e.g., restaurant reservation, train ticket booking) of goal-oriented dialogue systems can be viewed as different skills, and so does ordinary chatting abilities of chit-chat dialogue systems. In this paper, we propose to learn a dialogue system that independently parameterizes different dialogue skills, and learns to select and combine each of them through Attention over Parameters (AoP). The experimental results show that this approach achieves competitive performance on a combined dataset of MultiWOZ, In-Car Assistant, and Persona-Chat. Finally, we demonstrate that each dialogue skill is effectively learned and can be combined with other skills to produce selective responses.

* NeurIPS Conversational AI Workshops (Best Paper Award)

Via

Access Paper or Ask Questions

Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems

Nov 21, 2019
Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung

Figure 1 for Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems

Figure 2 for Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems

Figure 3 for Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems

Figure 4 for Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems

Recently, data-driven task-oriented dialogue systems have achieved promising performance in English. However, developing dialogue systems that support low-resource languages remains a long-standing challenge due to the absence of high-quality data. In order to circumvent the expensive and time-consuming data collection, we introduce Attention-Informed Mixed-Language Training (MLT), a novel zero-shot adaptation method for cross-lingual task-oriented dialogue systems. It leverages very few task-related parallel word pairs to generate code-switching sentences for learning the inter-lingual semantics across languages. Instead of manually selecting the word pairs, we propose to extract source words based on the scores computed by the attention layer of a trained English task-related model and then generate word pairs using existing bilingual dictionaries. Furthermore, intensive experiments with different cross-lingual embeddings demonstrate the effectiveness of our approach. Finally, with very few word pairs, our model achieves significant zero-shot adaptation performance improvements in both cross-lingual dialogue state tracking and natural language understanding (i.e., intent detection and slot filling) tasks compared to the current state-of-the-art approaches, which utilize a much larger amount of bilingual data.

* Accepted as an oral presentation in AAAI 2020

Via

Access Paper or Ask Questions

Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer

Oct 30, 2019
Genta Indra Winata, Samuel Cahyawijaya, Zhaojiang Lin, Zihan Liu, Pascale Fung

Figure 1 for Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer

Figure 2 for Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer

Figure 3 for Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer

Figure 4 for Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer

High performing deep neural networks come at the cost of computational complexity that limits its practicality for deployment on portable devices. We propose Low-Rank Transformer (LRT), a memory-efficient and fast neural architecture that significantly reduces the parameters and boosts the speed in training and inference for end-to-end speech recognition. Our approach reduces the number of parameters of the network by more than 50% parameters and speed-up the inference time by around 1.26x compared to the baseline transformer model. The experiments show that LRT models generalize better and yield lower error rates on both validation and test sets compared to the uncompressed transformer model. LRT models outperform existing works on several datasets in an end-to-end setting without using any external language model and acoustic data.

Via

Access Paper or Ask Questions

Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition

Sep 18, 2019
Genta Indra Winata, Zhaojiang Lin, Jamin Shin, Zihan Liu, Pascale Fung

Figure 1 for Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition

Figure 2 for Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition

Figure 3 for Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition

Figure 4 for Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition

In countries that speak multiple main languages, mixing up different languages within a conversation is commonly called code-switching. Previous works addressing this challenge mainly focused on word-level aspects such as word embeddings. However, in many cases, languages share common subwords, especially for closely related languages, but also for languages that are seemingly irrelevant. Therefore, we propose Hierarchical Meta-Embeddings (HME) that learn to combine multiple monolingual word-level and subword-level embeddings to create language-agnostic lexical representations. On the task of Named Entity Recognition for English-Spanish code-switching data, our model achieves the state-of-the-art performance in the multilingual settings. We also show that, in cross-lingual settings, our model not only leverages closely related languages, but also learns from languages with different roots. Finally, we show that combining different subunits are crucial for capturing code-switching entities.

* Accepted by EMNLP 2019

Via

Access Paper or Ask Questions

MoEL: Mixture of Empathetic Listeners

Aug 21, 2019
Zhaojiang Lin, Andrea Madotto, Jamin Shin, Peng Xu, Pascale Fung

Previous research on empathetic dialogue systems has mostly focused on generating responses given certain emotions. However, being empathetic not only requires the ability of generating emotional responses, but more importantly, requires the understanding of user emotions and replying appropriately. In this paper, we propose a novel end-to-end approach for modeling empathy in dialogue systems: Mixture of Empathetic Listeners (MoEL). Our model first captures the user emotions and outputs an emotion distribution. Based on this, MoEL will softly combine the output states of the appropriate Listener(s), which are each optimized to react to certain emotions, and generate an empathetic response. Human evaluations on empathetic-dialogues (Rashkin et al., 2018) dataset confirm that MoEL outperforms multitask training baseline in terms of empathy, relevance, and fluency. Furthermore, the case study on generated responses of different Listeners shows high interpretability of our model.

* Accepted by EMNLP2019

Via

Access Paper or Ask Questions

Getting To Know You: User Attribute Extraction from Dialogues

Aug 13, 2019
Chien-Sheng Wu, Andrea Madotto, Zhaojiang Lin, Peng Xu, Pascale Fung

Figure 1 for Getting To Know You: User Attribute Extraction from Dialogues

Figure 2 for Getting To Know You: User Attribute Extraction from Dialogues

Figure 3 for Getting To Know You: User Attribute Extraction from Dialogues

Figure 4 for Getting To Know You: User Attribute Extraction from Dialogues

User attributes provide rich and useful information for user understanding, yet structured and easy-to-use attributes are often sparsely populated. In this paper, we leverage dialogues with conversational agents, which contain strong suggestions of user information, to automatically extract user attributes. Since no existing dataset is available for this purpose, we apply distant supervision to train our proposed two-stage attribute extractor, which surpasses several retrieval and generation baselines on human evaluation. Meanwhile, we discuss potential applications (e.g., personalized recommendation and dialogue systems) of such extracted user attributes, and point out current limitations to cast light on future work.

* 1st Workshop on NLP for Conversational AI @ ACL 2019

Via

Access Paper or Ask Questions

CAiRE: An End-to-End Empathetic Chatbot

Jul 28, 2019
Zhaojiang Lin, Peng Xu, Genta Indra Winata, Zihan Liu, Pascale Fung

Figure 1 for CAiRE: An End-to-End Empathetic Chatbot

Figure 2 for CAiRE: An End-to-End Empathetic Chatbot

Figure 3 for CAiRE: An End-to-End Empathetic Chatbot

Figure 4 for CAiRE: An End-to-End Empathetic Chatbot

In this paper, we present an end-to-end empathetic conversation agent CAiRE. Our system adapts TransferTransfo (Wolf et al., 2019) learning approach that fine-tunes a large-scale pre-trained language model with multi-task objectives: response language modeling, response prediction and dialogue emotion detection. We evaluate our model on the recently proposed empathetic-dialogues dataset (Rashkin et al., 2019), the experiment results show that CAiRE achieves state-of-the-art performance on dialogue emotion detection and empathetic response generation.

Via

Access Paper or Ask Questions

CAiRE_HKUST at SemEval-2019 Task 3: Hierarchical Attention for Dialogue Emotion Classification

Jun 10, 2019
Genta Indra Winata, Andrea Madotto, Zhaojiang Lin, Jamin Shin, Yan Xu, Peng Xu, Pascale Fung

Figure 1 for CAiRE_HKUST at SemEval-2019 Task 3: Hierarchical Attention for Dialogue Emotion Classification

Figure 2 for CAiRE_HKUST at SemEval-2019 Task 3: Hierarchical Attention for Dialogue Emotion Classification

Figure 3 for CAiRE_HKUST at SemEval-2019 Task 3: Hierarchical Attention for Dialogue Emotion Classification

Figure 4 for CAiRE_HKUST at SemEval-2019 Task 3: Hierarchical Attention for Dialogue Emotion Classification

Detecting emotion from dialogue is a challenge that has not yet been extensively surveyed. One could consider the emotion of each dialogue turn to be independent, but in this paper, we introduce a hierarchical approach to classify emotion, hypothesizing that the current emotional state depends on previous latent emotions. We benchmark several feature-based classifiers using pre-trained word and emotion embeddings, state-of-the-art end-to-end neural network models, and Gaussian processes for automatic hyper-parameter search. In our experiments, hierarchical architectures consistently give significant improvements, and our best model achieves a 76.77% F1-score on the test set.

* Accepted by SemEval 2019

Via

Access Paper or Ask Questions

Personalizing Dialogue Agents via Meta-Learning

May 24, 2019
Zhaojiang Lin, Andrea Madotto, Chien-Sheng Wu, Pascale Fung

Figure 1 for Personalizing Dialogue Agents via Meta-Learning

Figure 2 for Personalizing Dialogue Agents via Meta-Learning

Figure 3 for Personalizing Dialogue Agents via Meta-Learning

Existing personalized dialogue models use human designed persona descriptions to improve dialogue consistency. Collecting such descriptions from existing dialogues is expensive and requires hand-crafted feature designs. In this paper, we propose to extend Model-Agnostic Meta-Learning (MAML)(Finn et al., 2017) to personalized dialogue learning without using any persona descriptions. Our model learns to quickly adapt to new personas by leveraging only a few dialogue samples collected from the same user, which is fundamentally different from conditioning the response on the persona descriptions. Empirical results on Persona-chat dataset (Zhang et al., 2018) indicate that our solution outperforms non-meta-learning baselines using automatic evaluation metrics, and in terms of human-evaluated fluency and consistency.

* Accepted in ACL 2019. Zhaojiang Lin* and Andrea Madotto* contributed equally to this work

Via

Access Paper or Ask Questions