Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Text": models, code, and papers

Improved Goal Oriented Dialogue via Utterance Generation and Look Ahead

Oct 24, 2021
Eyal Ben-David, Boaz Carmeli, Ateret Anaby-Tavor

Goal oriented dialogue systems have become a prominent customer-care interaction channel for most businesses. However, not all interactions are smooth, and customer intent misunderstanding is a major cause of dialogue failure. We show that intent prediction can be improved by training a deep text-to-text neural model to generate successive user utterances from unlabeled dialogue data. For that, we define a multi-task training regime that utilizes successive user-utterance generation to improve the intent prediction. Our approach achieves the reported improvement due to two complementary factors: First, it uses a large amount of unlabeled dialogue data for an auxiliary generation task. Second, it uses the generated user utterance as an additional signal for the intent prediction model. Lastly, we present a novel look-ahead approach that uses user utterance generation to improve intent prediction in inference time. Specifically, we generate counterfactual successive user utterances for conversations with ambiguous predicted intents, and disambiguate the prediction by reassessing the concatenated sequence of available and generated utterances.

  Access Paper or Ask Questions

Discontinuous Grammar as a Foreign Language

Oct 20, 2021
Daniel Fernández-González, Carlos Gómez-Rodríguez

In order to achieve deep natural language understanding, syntactic constituent parsing is a vital step, highly demanded by many artificial intelligence systems to process both text and speech. One of the most recent proposals is the use of standard sequence-to-sequence models to perform constituent parsing as a machine translation task, instead of applying task-specific parsers. While they show a competitive performance, these text-to-parse transducers are still lagging behind classic techniques in terms of accuracy, coverage and speed. To close the gap, we here extend the framework of sequence-to-sequence models for constituent parsing, not only by providing a more powerful neural architecture for improving their performance, but also by enlarging their coverage to handle the most complex syntactic phenomena: discontinuous structures. To that end, we design several novel linearizations that can fully produce discontinuities and, for the first time, we test a sequence-to-sequence model on the main discontinuous benchmarks, obtaining competitive results on par with task-specific discontinuous constituent parsers and achieving state-of-the-art scores on the (discontinuous) English Penn Treebank.

* 22 pages 

  Access Paper or Ask Questions

How Well Do You Know Your Audience? Reader-aware Question Generation

Oct 16, 2021
Ian Stewart, Rada Mihalcea

When writing, a person may need to anticipate questions from their readers, but different types of readers may ask very different types of questions. If someone is writing for advice about a problem, what question will a domain expert ask, and is this different from how a novice might react? In this paper, we address the task of reader-aware question generation. We collect a new data set of questions and posts from social media, augmented with background information about the post readers. Based on predictive analysis and descriptive differences, we find that different readers, such as experts and novices, consistently ask different types of questions. We next develop several text generation models that incorporate different types of reader background, including discrete and continuous reader representations based on the readers' prior behavior. We demonstrate that reader-aware models can perform on par or slightly better than the text-only model in some cases, particularly in cases where a post attracts very different questions from readers of different groups. Our work has the potential to help writers anticipate the information needs of different readers.

  Access Paper or Ask Questions

Human in the Loop for Machine Creativity

Oct 07, 2021
Neo Christopher Chung

Artificial intelligence (AI) is increasingly utilized in synthesizing visuals, texts, and audio. These AI-based works, often derived from neural networks, are entering the mainstream market, as digital paintings, songs, books, and others. We conceptualize both existing and future human-in-the-loop (HITL) approaches for creative applications and to develop more expressive, nuanced, and multimodal models. Particularly, how can our expertise as curators and collaborators be encoded in AI models in an interactive manner? We examine and speculate on long term implications for models, interfaces, and machine creativity. Our selection, creation, and interpretation of AI art inherently contain our emotional responses, cultures, and contexts. Therefore, the proposed HITL may help algorithms to learn creative processes that are much harder to codify or quantify. We envision multimodal HITL processes, where texts, visuals, sounds, and other information are coupled together, with automated analysis of humans and environments. Overall, these HITL approaches will increase interaction between human and AI, and thus help the future AI systems to better understand our own creative and emotional processes.

* 9th AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2021), Blue Sky Ideas track 

  Access Paper or Ask Questions

Integrating Categorical Features in End-to-End ASR

Oct 06, 2021
Rongqing Huang

All-neural, end-to-end ASR systems gained rapid interest from the speech recognition community. Such systems convert speech input to text units using a single trainable neural network model. E2E models require large amounts of paired speech text data that is expensive to obtain. The amount of data available varies across different languages and dialects. It is critical to make use of all these data so that both low resource languages and high resource languages can be improved. When we want to deploy an ASR system for a new application domain, the amount of domain specific training data is very limited. To be able to leverage data from existing domains is important for ASR accuracy in the new domain. In this paper, we treat all these aspects as categorical information in an ASR system, and propose a simple yet effective way to integrate categorical features into E2E model. We perform detailed analysis on various training strategies, and find that building a joint model that includes categorical features can be more accurate than multiple independently trained models.

* Submitted to ICASSP 2022 

  Access Paper or Ask Questions

Technological Approaches to Detecting Online Disinformation and Manipulation

Aug 26, 2021
Aleš Horák, Vít Baisa, Ondřej Herman

The move of propaganda and disinformation to the online environment is possible thanks to the fact that within the last decade, digital information channels radically increased in popularity as a news source. The main advantage of such media lies in the speed of information creation and dissemination. This, on the other hand, inevitably adds pressure, accelerating editorial work, fact-checking, and the scrutiny of source credibility. In this chapter, an overview of computer-supported approaches to detecting disinformation and manipulative techniques based on several criteria is presented. We concentrate on the technical aspects of automatic methods which support fact-checking, topic identification, text style analysis, or message filtering on social media channels. Most of the techniques employ artificial intelligence and machine learning with feature extraction combining available information resources. The following text firstly specifies the tasks related to computer detection of manipulation and disinformation spreading. The second section presents concrete methods of solving the tasks of the analysis, and the third sections enlists current verification and benchmarking datasets published and used in this area for evaluation and comparison.

* This is an author preprint of the 5th chapter in the book of "Challenging Online Propaganda and Disinformation in the 21st Century" published by Palgrave Macmillan at 

  Access Paper or Ask Questions

Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation

Apr 13, 2021
Hirofumi Inaguma, Tatsuya Kawahara, Shinji Watanabe

A conventional approach to improving the performance of end-to-end speech translation (E2E-ST) models is to leverage the source transcription via pre-training and joint training with automatic speech recognition (ASR) and neural machine translation (NMT) tasks. However, since the input modalities are different, it is difficult to leverage source language text successfully. In this work, we focus on sequence-level knowledge distillation (SeqKD) from external text-based NMT models. To leverage the full potential of the source language information, we propose backward SeqKD, SeqKD from a target-to-source backward NMT model. To this end, we train a bilingual E2E-ST model to predict paraphrased transcriptions as an auxiliary task with a single decoder. The paraphrases are generated from the translations in bitext via back-translation. We further propose bidirectional SeqKD in which SeqKD from both forward and backward NMT models is combined. Experimental evaluations on both autoregressive and non-autoregressive models show that SeqKD in each direction consistently improves the translation performance, and the effectiveness is complementary regardless of the model capacity.

* Accepted at NAACL-HLT 2021 (short paper) 

  Access Paper or Ask Questions

Exploring Knowledge Distillation of a Deep Neural Network for Multi-Script identification

Feb 20, 2021
Shuvayan Ghosh Dastidar, Kalpita Dutta, Nibaran Das, Mahantapas Kundu, Mita Nasipuri

Multi-lingual script identification is a difficult task consisting of different language with complex backgrounds in scene text images. According to the current research scenario, deep neural networks are employed as teacher models to train a smaller student network by utilizing the teacher model's predictions. This process is known as dark knowledge transfer. It has been quite successful in many domains where the final result obtained is unachievable through directly training the student network with a simple architecture. In this paper, we explore dark knowledge transfer approach using long short-term memory(LSTM) and CNN based assistant model and various deep neural networks as the teacher model, with a simple CNN based student network, in this domain of multi-script identification from natural scene text images. We explore the performance of different teacher models and their ability to transfer knowledge to a student network. Although the small student network's limited size, our approach obtains satisfactory results on a well-known script identification dataset CVSI-2015.

* 14 pages, 6 figures, 7 tables 

  Access Paper or Ask Questions

Belief-based Generation of Argumentative Claims

Jan 26, 2021
Milad Alshomary, Wei-Fan Chen, Timon Gurcke, Henning Wachsmuth

When engaging in argumentative discourse, skilled human debaters tailor claims to the beliefs of the audience, to construct effective arguments. Recently, the field of computational argumentation witnessed extensive effort to address the automatic generation of arguments. However, existing approaches do not perform any audience-specific adaptation. In this work, we aim to bridge this gap by studying the task of belief-based claim generation: Given a controversial topic and a set of beliefs, generate an argumentative claim tailored to the beliefs. To tackle this task, we model the people's prior beliefs through their stances on controversial topics and extend state-of-the-art text generation models to generate claims conditioned on the beliefs. Our automatic evaluation confirms the ability of our approach to adapt claims to a set of given beliefs. In a manual study, we additionally evaluate the generated claims in terms of informativeness and their likelihood to be uttered by someone with a respective belief. Our results reveal the limitations of modeling users' beliefs based on their stances, but demonstrate the potential of encoding beliefs into argumentative texts, laying the ground for future exploration of audience reach.

* Almost 9 pages, 1 figure, EACL-21 paper 

  Access Paper or Ask Questions

Misspelling Correction with Pre-trained Contextual Language Model

Jan 08, 2021
Yifei Hu, Xiaonan Jing, Youlim Ko, Julia Taylor Rayz

Spelling irregularities, known now as spelling mistakes, have been found for several centuries. As humans, we are able to understand most of the misspelled words based on their location in the sentence, perceived pronunciation, and context. Unlike humans, computer systems do not possess the convenient auto complete functionality of which human brains are capable. While many programs provide spelling correction functionality, many systems do not take context into account. Moreover, Artificial Intelligence systems function in the way they are trained on. With many current Natural Language Processing (NLP) systems trained on grammatically correct text data, many are vulnerable against adversarial examples, yet correctly spelled text processing is crucial for learning. In this paper, we investigate how spelling errors can be corrected in context, with a pre-trained language model BERT. We present two experiments, based on BERT and the edit distance algorithm, for ranking and selecting candidate corrections. The results of our experiments demonstrated that when combined properly, contextual word embeddings of BERT and edit distance are capable of effectively correcting spelling errors.

* Accepted by 2020 IEEE 19th International Conference on Cognitive Informatics & Cognitive Computing (ICCI* CC). IEEE 

  Access Paper or Ask Questions