"chatbots": models, code, and papers

Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning

Oct 12, 2021
Tosin Adewumi, Nosheen Abid, Maryam Pahlavan, Rickard Brännvall, Sana Sabah Sabry, Foteini Liwicki, Marcus Liwicki

Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work investigates, through an empirical study, the potential for transfer learning of such models to the Swedish language. DialoGPT, an English-language pre-trained model, is adapted by training on three different Swedish-language conversational datasets obtained from publicly available sources. Perplexity score (an automated intrinsic language model metric) and human evaluation surveys were used to assess the performance of the fine-tuned models, with results indicating that the capacity for transfer learning can be exploited with considerable success. Human evaluators asked to score the simulated dialogue judged over 57% of the chatbot responses to be human-like for the model trained on the largest (Swedish) dataset. We provide the demos and model checkpoints of our English and Swedish chatbots on the HuggingFace platform for public use.

* 9 pages, 5 tables, 1 figure 
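
The abstract above names perplexity as its automated metric. As a rough illustration only (not the authors' evaluation code), perplexity can be computed as the exponential of the mean negative log-likelihood per token. The function name and the sample log-probabilities below are hypothetical.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token).

    `token_log_probs` holds natural-log probabilities that a language
    model assigned to each token of a held-out text (hypothetical input).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

# Assumed example: a model that gives every token probability 0.25
# is as uncertain as a uniform choice among four options.
uniform_log_probs = [math.log(0.25)] * 4
print(perplexity(uniform_log_probs))
```

Lower values mean the model found the text less surprising, which is why the metric is usually reported alongside human judgments rather than instead of them.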

Short Text Conversation Based on Deep Neural Network and Analysis on Evaluation Measures

Jul 06, 2019
Hsiang-En Cherng, Chia-Hui Chang

With the development of natural language processing (NLP), automatic question-answering systems such as Watson, Siri, and Alexa have become one of the most important NLP applications. Nowadays, enterprises try to build automatic customer service chatbots to save human resources and provide 24-hour customer service. Evaluation of chatbots currently relies heavily on human annotation, which costs a great deal of time. Thus, a new Short Text Conversation subtask called Dialogue Quality (DQ) and Nugget Detection (ND) has been initiated, which aims to automatically evaluate dialogues generated by chatbots. In this paper, we solve the DQ and ND subtasks with deep neural networks. We propose two models for both DQ and ND subtasks, each constructed with a hierarchical structure (embedding layer, utterance layer, context layer, and memory layer) to hierarchically learn dialogue representations from the word level, sentence level, and context level to the long-range context level. Furthermore, we apply gating and attention mechanisms at the utterance layer and context layer to improve performance. We also tried BERT as a replacement for the embedding layer and utterance layer to produce sentence representations. The results show that BERT produced a better utterance representation than multi-stack CNN for both DQ and ND subtasks and outperformed models proposed by other researchers. The subtask's evaluation measures, that is, NMD and RSNOD for DQ and JSD and RNSS for ND, are not traditional evaluation measures such as accuracy, precision, recall, and F1-score. Thus, we ran a series of experiments using traditional evaluation measures and analyzed the performance and errors.

* 8 pages, 5 figures 
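
Of the measures listed in the abstract above, JSD is the most widely known. The sketch below assumes it refers to the standard Jensen-Shannon divergence between a predicted and a reference label distribution; the exact variant and logarithm base used by the subtask are not stated in the abstract, and the sample distributions are hypothetical.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon_divergence(p, q):
    """Symmetric divergence between two probability distributions.

    With base-2 logarithms (an assumption here) the value lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # The mixture m is positive wherever p or q is, so KL never divides by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical label distributions over three nugget types.
predicted = [0.7, 0.2, 0.1]
reference = [0.5, 0.3, 0.2]
print(jensen_shannon_divergence(predicted, reference))
```

Unlike accuracy or F1, a measure of this kind compares whole distributions, so a prediction is rewarded for being close to the annotators' spread of labels and not only for matching the majority label.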

Towards Ethical Machines Via Logic Programming

Sep 18, 2019
Abeer Dyoub, Stefania Costantini, Francesca A. Lisi

Autonomous intelligent agents are playing increasingly important roles in our lives. They contain information about us and start to perform tasks on our behalf. Chatbots are an example of such agents that need to engage in complex conversations with humans. Thus, we need to ensure that they behave ethically. In this work we propose a hybrid logic-based approach for ethical chatbots.

* EPTCS 306, 2019, pp. 333-339 
* In Proceedings ICLP 2019, arXiv:1909.07646 

HINT3: Raising the bar for Intent Detection in the Wild

Oct 10, 2020
Gaurav Arora, Chirag Jain, Manas Chaturvedi, Krupal Modi

Intent Detection systems in the real world are exposed to the complexities of imbalanced datasets containing varying perceptions of intent, unintended correlations, and domain-specific aberrations. To facilitate benchmarking that can reflect near real-world scenarios, we introduce 3 new datasets created from live chatbots in diverse domains. Unlike most existing datasets, which are crowdsourced, our datasets contain real user queries received by the chatbots and facilitate penalising unwanted correlations grasped during the training process. We evaluate 4 NLU platforms and a BERT-based classifier and find that performance saturates at inadequate levels on test sets because all systems latch on to unintended patterns in training data.

* Accepted at EMNLP-2020's Insights workshop 

A Unified Framework for Emotion Identification and Generation in Dialogues

May 31, 2022
Avinash Madasu, Mauajama Firdaus, Asif Eqbal

Social chatbots have gained immense popularity, and their appeal lies not just in their capacity to respond to diverse requests from users, but also in their ability to develop an emotional connection with users. To further develop and promote social chatbots, we need to concentrate on increasing user interaction and take into account both the intellectual and emotional quotient of conversational agents. In this paper, we propose a multi-task framework that jointly identifies the emotion of a given dialogue and generates a response in accordance with the identified emotion. We employ a BERT-based network for creating an empathetic system and use a mixed objective function that trains the end-to-end network with both the classification and generation loss. Experimental results show that our proposed framework outperforms current state-of-the-art models.
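
The abstract above describes a mixed objective that combines a classification loss and a generation loss. One common way to combine two such losses is a weighted sum; the sketch below assumes that form, and the weight `alpha`, the function names, and the sample probabilities are all hypothetical, since the abstract does not say how the two terms are balanced.

```python
import math

def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the correct class or token."""
    return -math.log(probs[target_index])

def mixed_loss(emotion_probs, emotion_label, token_probs, token_labels, alpha=0.5):
    """Weighted sum of an emotion-classification loss and a response-generation loss.

    alpha is an assumed mixing weight: 1.0 trains only the classifier,
    0.0 trains only the generator.
    """
    classification = cross_entropy(emotion_probs, emotion_label)
    # Average the per-token losses so response length does not dominate.
    generation = sum(
        cross_entropy(p, t) for p, t in zip(token_probs, token_labels)
    ) / len(token_labels)
    return alpha * classification + (1 - alpha) * generation

# Hypothetical model outputs: 3 emotion classes, 2 response tokens over a 4-word vocabulary.
emotion_probs = [0.1, 0.8, 0.1]
token_probs = [[0.6, 0.2, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]]
print(mixed_loss(emotion_probs, 1, token_probs, [0, 3], alpha=0.5))
```

Training on the sum means a single backward pass updates the shared encoder for both tasks, which is the usual motivation for multi-task setups of this kind.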


ConveRT for FAQ Answering

Aug 03, 2021
Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, Walter Daelemans

Knowledgeable FAQ chatbots are a valuable resource to any organization. Unlike traditional call centers or FAQ web pages, they provide instant responses and are always available. Our experience running a COVID19 chatbot revealed the lack of resources available for FAQ answering in non-English languages. While powerful and efficient retrieval-based models exist for English, this is rarely the case for other languages, which do not have the same amount of training data available. In this work, we propose a novel pretraining procedure to adapt ConveRT, an English SOTA conversational agent, to other languages with less training data available. We apply it for the first time to the task of Dutch FAQ answering related to the COVID19 vaccine. We show it performs better than an open-source alternative in both a low-data regime and a high-data regime.


Learning to mirror speaking styles incrementally

Mar 05, 2020
Siyi Liu, Ziang Leng, Derry Wijaya

Mirroring is the behavior in which one person subconsciously imitates the gesture, speech pattern, or attitude of another. In conversations, mirroring often signals the speakers' enjoyment and engagement in their communication. In chatbots, methods have been proposed to add personas to the chatbots and to train them to speak or to shift their dialogue style to that of the personas. However, they often require a large dataset consisting of dialogues of the target personalities to train. In this work, we explore a method that can learn to mirror the speaking styles of a person incrementally. Our method extracts n-grams that capture a person's speaking styles and uses the n-grams to create patterns for transforming sentences to the person's speaking styles. Our experiments show that our method is able to capture patterns of speaking style that can be used to transform regular sentences into sentences with the target style.

* 4 pages, 3 tables, 1 figure 
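
The abstract above mentions extracting n-grams that capture a person's speaking style. The sketch below shows only generic n-gram counting over a speaker's utterances; the whitespace tokenization, the frequency threshold, and the sample sentences are assumptions, and the paper's actual selection and pattern-creation steps are not described in the abstract.

```python
from collections import Counter

def extract_ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def frequent_ngrams(sentences, n=2, min_count=2):
    """Count n-grams across a speaker's sentences and keep the recurring ones.

    Keeping n-grams seen at least `min_count` times is an assumed heuristic
    for "characteristic of the speaker", not the paper's criterion.
    """
    counts = Counter()
    for sentence in sentences:
        counts.update(extract_ngrams(sentence.lower().split(), n))
    return {gram: c for gram, c in counts.items() if c >= min_count}

# Hypothetical utterances from one speaker.
utterances = ["you know it is fine", "you know I tried", "well I tried my best"]
print(frequent_ngrams(utterances))
```

Because the counter can be updated one sentence at a time, this style of bookkeeping fits the incremental setting the abstract describes, where no large persona dataset is available up front.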

Self-Attentional Models Application in Task-Oriented Dialogue Generation Systems

Sep 11, 2019
Mansour Saffar Mehrjardi, Amine Trabelsi, Osmar R. Zaiane

Self-attentional models are a new paradigm for sequence modelling tasks that differs from common sequence modelling methods, such as recurrence-based and convolution-based sequence learning, in that their architecture is based solely on the attention mechanism. Self-attentional models have been used to create state-of-the-art models in many NLP tasks such as neural machine translation, but their use has not yet been explored for training end-to-end task-oriented dialogue generation systems. In this study, we apply these models to three different datasets for training task-oriented chatbots. Our findings show that self-attentional models can be exploited to create end-to-end task-oriented chatbots which not only achieve higher evaluation scores compared to recurrence-based models, but also do so more efficiently.

* Appeared in proceedings of Recent Advances in Natural Language Processing (RANLP) Conference, 2019 
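
The abstract above contrasts attention-only architectures with recurrent and convolutional ones. As a minimal illustration of the core operation, the sketch below implements scaled dot-product self-attention in plain Python. It is a simplification: queries, keys, and values are all taken to be the input vectors themselves, with no learned projections, multiple heads, or positional information, and the sample input is hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    x is a list of token vectors of equal length d. Each output vector is a
    weighted average of all input vectors, weighted by similarity to the query.
    """
    d = len(x[0])
    outputs = []
    for query in x:
        scores = [
            sum(qi * ki for qi, ki in zip(query, key)) / math.sqrt(d)
            for key in x
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * value[j] for w, value in zip(weights, x)) for j in range(d)]
        )
    return outputs

# Hypothetical sequence of three 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in self_attention(tokens):
    print(row)
```

Every position attends to every other position in one step, with no sequential dependency between positions, which is one reason such models can be trained more efficiently than recurrence-based ones.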

Combining Textual Content and Structure to Improve Dialog Similarity

Feb 20, 2018
Ana Paula Appel, Paulo Rodrigo Cavalin, Marisa Affonso Vasconcelos, Claudio Santos Pinhanez

Chatbots, taking advantage of the success of messaging apps and recent advances in Artificial Intelligence, have become very popular, from helping businesses improve customer service to chatting with users for the sake of conversation and engagement (celebrity or personal bots). However, developing and improving a chatbot requires understanding the data generated by its users. Dialog data differs in nature from a simple question-and-answer interaction: context and temporal properties (turn order) create a different understanding of such data. In this paper, we propose a novel metric to compute dialog similarity based not only on the text content but also on information related to the dialog structure. Our experimental results on the Switchboard dataset show that using evidence from both textual content and dialog structure leads to more accurate results than using each measure in isolation.

* 5 pages