Zhou Yu

HERALD: An Annotation Efficient Method to Detect User Disengagement in Social Conversations

Jun 01, 2021
Weixin Liang, Kai-Hui Liang, Zhou Yu

Open-domain dialog systems have a user-centric goal: to provide humans with an engaging conversation experience. User engagement is one of the most important metrics for evaluating open-domain dialog systems, and could also be used as real-time feedback to benefit dialog policy learning. Existing work on detecting user disengagement typically requires hand-labeling many dialog samples. We propose HERALD, an annotation-efficient framework that reframes the training data annotation process as a denoising problem. Specifically, instead of manually labeling training samples, we first use a set of labeling heuristics to automatically label training samples. We then denoise the weakly labeled data using the Shapley algorithm. Finally, we use the denoised data to train a user engagement detector. Our experiments show that HERALD improves annotation efficiency significantly and achieves 86% user disengagement detection accuracy in two dialog corpora.

* ACL 2021. Code & data available at https://github.com/Weixin-Liang/HERALD/
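
The three-step pipeline in the abstract (heuristic labeling, Shapley-based denoising, detector training) can be illustrated with a minimal sketch of the first two steps. Everything here is an assumption made for illustration: the two heuristics, the 2-D toy features, and the choice of a K-nearest-neighbor Shapley recursion are hypothetical stand-ins, since the abstract does not specify which heuristics or which Shapley variant the authors use. See the linked repository for the actual implementation.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical labeling heuristics (not taken from the paper): each returns
# 1 (disengaged), 0 (engaged), or None to abstain.
Heuristic = Callable[[str], Optional[int]]


def very_short_reply(utterance: str) -> Optional[int]:
    return 1 if len(utterance.split()) <= 1 else None


def complaint_cue(utterance: str) -> Optional[int]:
    cues = ("boring", "whatever", "stop talking")
    return 1 if any(cue in utterance.lower() for cue in cues) else None


HEURISTICS: List[Heuristic] = [very_short_reply, complaint_cue]


def weak_label(utterance: str, heuristics: Sequence[Heuristic]) -> int:
    """Majority vote over non-abstaining heuristics; default to engaged (0)."""
    votes = [v for v in (h(utterance) for h in heuristics) if v is not None]
    return 1 if votes and 2 * sum(votes) >= len(votes) else 0


def knn_shapley(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[int],
    val_x: Sequence[Sequence[float]],
    val_y: Sequence[int],
    k: int,
) -> List[float]:
    """Shapley value of each training point for a K-NN classifier,
    averaged over a small clean validation set (assumed variant)."""
    n = len(train_x)
    values = [0.0] * n
    for vx, vy in zip(val_x, val_y):
        # Sort training points from nearest to farthest from this validation point.
        order = sorted(
            range(n),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], vx)),
        )
        match = [1.0 if train_y[i] == vy else 0.0 for i in order]
        s = [0.0] * n
        s[n - 1] = match[n - 1] / n
        # Recursion from the farthest point inward; rank is 1-indexed.
        for j in range(n - 2, -1, -1):
            rank = j + 1
            s[j] = s[j + 1] + (match[j] - match[j + 1]) / k * min(k, rank) / rank
        for pos, i in enumerate(order):
            values[i] += s[pos] / len(val_x)
    return values


# Toy weakly labeled data: the last point sits in the "engaged" cluster
# but carries a (noisy) "disengaged" label.
train_x = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0), (0.05, 0.05)]
train_y = [0, 0, 1, 1, 1]
val_x = [(0.0, 0.1), (1.0, 0.9)]
val_y = [0, 1]

values = knn_shapley(train_x, train_y, val_x, val_y, k=1)
# Keep only points that do not hurt validation accuracy.
kept = [i for i, v in enumerate(values) if v >= 0]
```

In this toy run the mislabeled point receives a negative value and is filtered out. Dropping it is only one possible treatment; relabeling such points is an equally plausible design choice.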

Annotation Inconsistency and Entity Bias in MultiWOZ

May 29, 2021
Kun Qian, Ahmad Beirami, Zhouhan Lin, Ankita De, Alborz Geramifard, Zhou Yu, Chinnadhurai Sankar

MultiWOZ is one of the most popular multi-domain task-oriented dialog datasets, containing 10K+ annotated dialogs covering eight domains. It has been widely accepted as a benchmark for various dialog tasks, e.g., dialog state tracking (DST), natural language generation (NLG), and end-to-end (E2E) dialog modeling. In this work, we identify an overlooked issue with dialog state annotation inconsistencies in the dataset, where a slot type is tagged inconsistently across similar dialogs leading to confusion for DST modeling. We propose an automated correction for this issue, which is present in a whopping 70% of the dialogs. Additionally, we notice that there is significant entity bias in the dataset (e.g., "cambridge" appears in 50% of the destination cities in the train domain). The entity bias can potentially lead to named entity memorization in generative models, which may go unnoticed as the test set suffers from a similar entity bias as well. We release a new test set with all entities replaced with unseen entities. Finally, we benchmark joint goal accuracy (JGA) of the state-of-the-art DST baselines on these modified versions of the data. Our experiments show that the annotation inconsistency corrections lead to 7-10% improvement in JGA. On the other hand, we observe a 29% drop in JGA when models are evaluated on the new test set with unseen entities.

* Accepted by SIGDIAL 2021
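
For readers unfamiliar with the two quantities the abstract reports, the sketch below shows one common way to compute joint goal accuracy (a turn counts as correct only if every slot in the predicted state matches the gold state) and a simple measure of entity bias (the share of the most frequent value for a slot). The slot names and toy values are illustrative assumptions, not data from the paper, and the exact evaluation script the authors used may differ.

```python
from collections import Counter
from typing import Dict, List

DialogState = Dict[str, str]


def joint_goal_accuracy(
    predicted: List[DialogState], gold: List[DialogState]
) -> float:
    """Fraction of turns whose full predicted state equals the gold state."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same number of turns")
    if not gold:
        return 0.0
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


def top_entity_share(slot_values: List[str]) -> float:
    """Share of the single most frequent value among all values of a slot."""
    if not slot_values:
        return 0.0
    _, count = Counter(slot_values).most_common(1)[0]
    return count / len(slot_values)


# Illustrative toy states; slot names are hypothetical.
gold_states = [
    {"train-destination": "cambridge"},
    {"train-destination": "cambridge", "train-day": "monday"},
    {"train-destination": "ely", "train-day": "friday"},
]
predicted_states = [
    {"train-destination": "cambridge"},
    {"train-destination": "cambridge", "train-day": "monday"},
    {"train-destination": "cambridge", "train-day": "friday"},  # one wrong slot
]

jga = joint_goal_accuracy(predicted_states, gold_states)
share = top_entity_share(["cambridge", "cambridge", "ely", "norwich"])
```

Because a single wrong slot makes the whole turn incorrect, this metric is sensitive both to inconsistent annotations and to memorized entities.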

Leveraging Slot Descriptions for Zero-Shot Cross-Domain Dialogue State Tracking

May 10, 2021
Zhaojiang Lin, Bing Liu, Seungwhan Moon, Paul Crook, Zhenpeng Zhou, Zhiguang Wang, Zhou Yu, Andrea Madotto, Eunjoon Cho, Rajen Subba

Zero-shot cross-domain dialogue state tracking (DST) enables us to handle task-oriented dialogue in unseen domains without the expense of collecting in-domain data. In this paper, we propose a slot description enhanced generative approach for zero-shot cross-domain DST. Specifically, our model first encodes dialogue context and slots with a pre-trained self-attentive encoder, and generates slot values in an auto-regressive manner. In addition, we incorporate Slot Type Informed Descriptions that capture the shared information across slots to facilitate cross-domain knowledge transfer. Experimental results on the MultiWOZ dataset show that our proposed method significantly improves existing state-of-the-art results in the zero-shot cross-domain setting.

* NAACL 2021

LEGOEval: An Open-Source Toolkit for Dialogue System Evaluation via Crowdsourcing

May 05, 2021
Yu Li, Josh Arnold, Feifan Yan, Weiyan Shi, Zhou Yu

We present LEGOEval, an open-source toolkit that enables researchers to easily evaluate dialogue systems in a few lines of code using the online crowdsourcing platform Amazon Mechanical Turk. Compared to existing toolkits, LEGOEval features a flexible task design by providing a Python API that maps to commonly used React.js interface components. Researchers can personalize their evaluation procedures easily with our built-in pages as if playing with LEGO blocks. Thus, LEGOEval provides a fast, consistent method for reproducing human evaluation results. Besides the flexible task design, LEGOEval also offers an easy API to review collected data.

Revealing Persona Biases in Dialogue Systems

Apr 18, 2021
Emily Sheng, Josh Arnold, Zhou Yu, Kai-Wei Chang, Nanyun Peng

Dialogue systems in the form of chatbots and personal assistants are being increasingly integrated into people's lives. These dialogue systems often have the ability to adopt an anthropomorphic persona, mimicking a societal demographic to appear more approachable and trustworthy to users. However, the adoption of a persona can result in the adoption of biases. We define persona biases as harmful differences in text (e.g., varying levels of offensiveness or affirmations of biased statements) generated from adopting different demographic personas. In this paper, we present the first large-scale study on persona biases in dialogue systems and conduct analyses on personas of different social classes, sexual orientations, races, and genders. Furthermore, we introduce an open-source framework, UnitPersonaBias, a tool to explore and aggregate subtle persona biases in dialogue systems. In our studies of the Blender and DialoGPT dialogue systems, we show that the choice of personas can affect the degree of harms in generated responses. Additionally, adopting personas of more diverse, historically marginalized demographics appears to decrease harmful responses the most.

* 9 pages

DEUX: An Attribute-Guided Framework for Sociable Recommendation Dialog Systems

Apr 16, 2021
Yu Li, Shirley Anugrah Hayati, Weiyan Shi, Zhou Yu

It is important for sociable recommendation dialog systems to produce both on-task content and social content to engage users and gain their favor. In addition to understanding the user preferences and providing a satisfying recommendation, such systems must be able to hold coherent and natural social conversations with the user. Traditional dialog state tracking cannot be applied to such systems because it does not track the attributes in the social content. To address this challenge, we propose DEUX, a novel attribute-guided framework to create better user experiences while accomplishing a movie recommendation task. DEUX has a module that keeps track of the movie attributes (e.g., favorite genres, actors, etc.) in both user utterances and system responses. This allows the system to introduce new movie attributes in its social content. DEUX also allows multiple values for the same attribute type, which suits the recommendation task since a user may, for instance, like multiple genres. Experiments suggest that DEUX outperforms all the baselines in being more consistent, fitting the user preferences better, and providing a more engaging chat experience. Our approach can be applied to similar problems in sociable task-oriented dialog systems.

A Student-Teacher Architecture for Dialog Domain Adaptation under the Meta-Learning Setting

Apr 06, 2021
Kun Qian, Wei Wei, Zhou Yu

Numerous new dialog domains are being created every day, while collecting data for these domains is extremely costly since it involves human interactions. Therefore, it is essential to develop algorithms that can adapt to different domains efficiently when building data-driven dialog models. Most recent research on domain adaptation focuses on giving the model a better initialization, rather than optimizing the adaptation process. We propose an efficient domain-adaptive task-oriented dialog system model, which incorporates a meta-teacher model to emphasize the different impacts of generated tokens with respect to the context. We first train our base dialog model and meta-teacher model adversarially in a meta-learning setting on rich-resource domains. The meta-teacher learns to quantify the importance of tokens under different contexts across different domains. During adaptation, the meta-teacher guides the dialog model to focus on important tokens in order to achieve better adaptation efficiency. We evaluate our model on two multi-domain datasets, MultiWOZ and Google Schema-Guided Dialogue, and achieve state-of-the-art performance.

* Accepted by AAAI 2021

Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems

Apr 01, 2021
Derek Chen, Howard Chen, Yi Yang, Alex Lin, Zhou Yu

Existing goal-oriented dialogue datasets focus mainly on identifying slots and values. However, customer support interactions in reality often involve agents following multi-step procedures derived from explicitly-defined company policies as well. To study customer service dialogue systems in more realistic settings, we introduce the Action-Based Conversations Dataset (ABCD), a fully-labeled dataset with over 10K human-to-human dialogues containing 55 distinct user intents requiring unique sequences of actions constrained by policies to achieve task success. We propose two additional dialog tasks, Action State Tracking and Cascading Dialogue Success, and establish a series of baselines involving large-scale, pre-trained language models on this dataset. Empirical results demonstrate that while more sophisticated networks outperform simpler models, a considerable gap (50.8% absolute accuracy) still exists to reach human-level performance on ABCD.

* 16 pages, 5 figures. Accepted at NAACL 2021

UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training

Apr 01, 2021
Mingyang Zhou, Luowei Zhou, Shuohang Wang, Yu Cheng, Linjie Li, Zhou Yu, Jingjing Liu

Vision-and-language pre-training has achieved impressive success in learning multimodal representations between vision and language. To generalize this success to non-English languages, we introduce UC2, the first machine translation-augmented framework for cross-lingual cross-modal representation learning. To tackle the scarcity of multilingual captions for image datasets, we first augment existing English-only datasets with other languages via machine translation (MT). Then we extend the standard Masked Language Modeling and Image-Text Matching training objectives to a multilingual setting, where alignment between different languages is captured through shared visual context (i.e., using the image as a pivot). To facilitate the learning of a joint embedding space of images and all languages of interest, we further propose two novel pre-training tasks, namely Masked Region-to-Token Modeling (MRTM) and Visual Translation Language Modeling (VTLM), leveraging MT-enhanced translated data. Evaluation on multilingual image-text retrieval and multilingual visual question answering benchmarks demonstrates that our proposed framework achieves new state-of-the-art results on diverse non-English benchmarks while maintaining comparable performance to monolingual pre-trained models on English tasks.

Attribute Alignment: Controlling Text Generation from Pre-trained Language Models

Mar 20, 2021
Dian Yu, Kenji Sagae, Zhou Yu

Large language models benefit from training with a large amount of unlabeled text, which gives them increasingly fluent and diverse generation capabilities. However, using these models for text generation that takes into account target attributes, such as sentiment polarity or specific topics, remains a challenge. We propose a simple and flexible method for controlling text generation by aligning disentangled attribute representations. In contrast to recent efforts on training a discriminator to perturb the token level distribution for an attribute, we use the same data to learn an alignment function to guide the pre-trained, non-controlled language model to generate texts with the target attribute without changing the original language model parameters. We evaluate our method on sentiment- and topic-controlled generation, and show large performance gains over previous methods while retaining fluency and diversity.
