"Topic": models, code, and papers

Generating Persona-Consistent Dialogue Responses Using Deep Reinforcement Learning

Apr 30, 2020
Mohsen Mesgar, Edwin Simpson, Yue Wang, Iryna Gurevych

Recent transformer-based open-domain dialogue agents are trained by reference responses in a fully supervised scenario. Such agents often display inconsistent personalities as training data potentially contain contradictory responses to identical input utterances and no persona-relevant criteria are used in their training losses. We propose a novel approach to train transformer-based dialogue agents using actor-critic reinforcement learning. We define a new reward function to assess generated responses in terms of persona consistency, topic consistency, and fluency. Our reference-agnostic reward relies only on a dialogue history and a persona defined by a list of facts. Automatic and human evaluations on the PERSONACHAT dataset show that our proposed approach increases the rate of persona-consistent responses compared with its peers that are trained in a fully supervised scenario using reference responses.

Towards Prediction Explainability through Sparse Communication

Apr 28, 2020
Marcos V. Treviso, André F. T. Martins

Explainability is a topic of growing importance in NLP. In this work, we provide a unified perspective of explainability as a communication problem between an explainer and a layperson about a classifier's decision. We use this framework to compare several prior approaches for extracting explanations, including gradient methods, representation erasure, and attention mechanisms, in terms of their communication success. In addition, we reinterpret these methods in light of classical feature selection, and we use this as inspiration to propose new embedded methods for explainability, through the use of selective, sparse attention. Experiments in text classification, natural language entailment, and machine translation, using different configurations of explainers and laypeople (including both machines and humans), reveal an advantage of attention-based explainers over gradient and erasure methods. Furthermore, human evaluation experiments show promising results with post-hoc explainers trained to optimize communication success and faithfulness.

Do sequence-to-sequence VAEs learn global features of sentences?

Apr 16, 2020
Tom Bosc, Pascal Vincent

A longstanding goal in NLP is to compute global sentence representations. Such representations would be useful for sample-efficient semi-supervised learning and controllable text generation. To learn to represent global and local information separately, Bowman et al. (2016) proposed to train a sequence-to-sequence model with the variational auto-encoder (VAE) objective. What precisely is encoded in these latent variables, which are expected to capture global features? We measure which words benefit most from the latent information by decomposing the reconstruction loss per position in the sentence. Using this method, we see that VAEs are prone to memorizing the first words and the sentence length, drastically limiting their usefulness. To alleviate this, we propose variants based on bag-of-words assumptions and language model pretraining. These variants learn latents that are more global: they are more predictive of topic or sentiment labels, and their reconstructions are more faithful to the labels of the original documents.
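
For reference, the VAE objective mentioned in this abstract is the standard evidence lower bound, and with an autoregressive decoder its reconstruction term splits into one term per position, which is what makes a per-position analysis possible. The notation below is assumed for illustration and may differ from the paper's: x_t is the t-th token, T the sentence length, z the latent variable, q_phi the encoder, and p_theta the decoder.

```latex
\mathcal{L}(x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big),
\qquad
\log p_\theta(x \mid z) = \sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t}, z).
```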

The Enron Corpus: Where the Email Bodies are Buried?

Jan 24, 2020
David Noever

To probe the largest public-domain email database for indicators of fraud, we apply machine learning and accomplish four investigative tasks. First, we identify persons of interest (POI), using financial records and email, and report a peak accuracy of 95.7%. Secondly, we find any publicly exposed personally identifiable information (PII) and discover 50,000 previously unreported instances. Thirdly, we automatically flag legally responsive emails as scored by human experts in the California electricity blackout lawsuit, and find a peak 99% accuracy. Finally, we track three years of primary topics and sentiment across over 10,000 unique people before, during and after the onset of the corporate crisis. Where possible, we compare accuracy against execution times for 51 algorithms and report human-interpretable business rules that can scale to vast datasets.

Towards Generating Explanations for ASP-Based Link Analysis using Declarative Program Transformations

Sep 08, 2019
Martin Atzmueller, Cicek Güven, Dietmar Seipel

The explication and the generation of explanations are prominent topics in artificial intelligence and data science, in order to make methods and systems more transparent and understandable for humans. This paper investigates the problem of link analysis, specifically link prediction and anomalous link discovery in social networks, using the declarative method of Answer Set Programming (ASP). Applying ASP for link prediction provides a powerful declarative approach, e.g., for incorporating domain knowledge for explicative prediction. In this context, we propose a novel method for generating explanations, as offline justifications, using declarative program transformations. The method itself is purely based on syntactic transformations of declarative programs, e.g., in an ASP formalism, using rule instrumentation. We demonstrate the efficacy of the proposed approach, exemplifying it in an application on link analysis in social networks, also including domain knowledge.

* Part of DECLARE 19 proceedings 

Unsupervised Segmentation of Hyperspectral Images Using 3D Convolutional Autoencoders

Jul 20, 2019
Jakub Nalepa, Michal Myller, Yasuteru Imai, Ken-ichi Honda, Tomomi Takeda, Marek Antoniak

Hyperspectral image analysis has become an important topic widely researched by the remote sensing community. Classification and segmentation of such imagery help understand the underlying materials within a scanned scene, since hyperspectral images convey detailed information captured in a number of spectral bands. Although deep learning has established the state of the art in the field, it remains challenging to train well-generalizing models due to the lack of ground-truth data. In this letter, we tackle this problem and propose an end-to-end approach to segment hyperspectral images in a fully unsupervised way. We introduce a new deep architecture which couples 3D convolutional autoencoders with clustering. Our multi-faceted experimental study, performed over benchmark and real-life data, revealed that our approach delivers high-quality segmentation without any prior class labels.

* Submitted to IEEE Geoscience and Remote Sensing Letters 

A Spatial-temporal 3D Human Pose Reconstruction Framework

Jan 10, 2019
X. T. Nguyen, T. D. Ngo, T. H. Le

3D human pose reconstruction from a single-view camera is a challenging topic. Many approaches have been proposed, but almost all process frames independently, even though the frames of a pose sequence are highly correlated. In contrast, we introduce a novel spatial-temporal 3D reconstruction framework that leverages both intra-frame and inter-frame relationships in consecutive 2D pose sequences. The framework implements the Orthogonal Matching Pursuit (OMP) algorithm, pre-trained pose-angle limits, and temporal models. We quantitatively compare our framework against recent works on the CMU motion capture dataset and Vietnamese traditional dance sequences. Our method outperforms the others, with 10 percent lower Euclidean reconstruction error, and is robust against Gaussian noise. Additionally, our reconstructed 3D pose sequences are smoother and more natural than those of other methods.

* 10 pages. JIPS Journal 2018 

Hessian-Aware Zeroth-Order Optimization for Black-Box Adversarial Attack

Dec 29, 2018
Haishan Ye, Zhichao Huang, Cong Fang, Chris Junchi Li, Tong Zhang

Zeroth-order optimization, or derivative-free optimization, is an important research topic in machine learning. Recently, it has become a key tool in black-box adversarial attacks on neural-network-based image classifiers. However, existing zeroth-order optimization algorithms rarely extract Hessian information of the model function. In this paper, we utilize the second-order information of the objective function and propose a novel Hessian-aware zeroth-order algorithm called ZO-HessAware. Our theoretical result shows that ZO-HessAware has an improved zeroth-order convergence rate and query complexity under structured Hessian approximation, for which we propose a few approximation methods. Our empirical studies on the black-box adversarial attack problem validate that our algorithm can achieve improved success rates with a lower query complexity.
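
To make the term "zeroth-order" concrete, the sketch below estimates a gradient from function evaluations alone, using a plain random-direction forward-difference estimator, and runs gradient descent with it. This is the generic baseline idea, not the Hessian-aware ZO-HessAware algorithm from the paper; the function names, step sizes, and sample counts are all assumptions chosen for illustration.

```python
import random


def zo_gradient(f, x, mu=1e-4, num_samples=50, seed=0):
    """Estimate the gradient of f at x using only function evaluations.

    Averages forward differences along random Gaussian directions
    (hypothetical helper; not the paper's Hessian-aware estimator).
    """
    rng = random.Random(seed)
    d = len(x)
    grad = [0.0] * d
    fx = f(x)
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        # Directional derivative estimate along u.
        scale = (f(x_plus) - fx) / mu
        for i in range(d):
            grad[i] += scale * u[i] / num_samples
    return grad


def zo_descent(f, x0, step=0.05, iters=100):
    """Gradient descent driven purely by the zeroth-order estimate."""
    x = list(x0)
    for t in range(iters):
        g = zo_gradient(f, x, seed=t)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def quadratic(x):
    """Toy black-box objective with its minimum at the origin."""
    return sum(xi * xi for xi in x)


if __name__ == "__main__":
    start = [1.0, -2.0, 3.0]
    end = zo_descent(quadratic, start)
    print(quadratic(start), quadratic(end))
```

Each iteration here costs `num_samples + 1` queries to the objective, which is why query complexity is the quantity such methods try to reduce.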

Impact of Intervals on the Emotional Effect in Western Music

Dec 10, 2018
Cengiz Kaygusuz, Julian Zuluaga

Every art form ultimately aims to evoke an emotional response in its audience, and music is no different. While the precise perception of music is highly subjective, there is broad agreement on the "feeling" of a piece of music. Based on this observation, in this study we aimed to determine the emotional feeling associated with short passages of music, specifically by analyzing their melodic aspects. We used the dataset put together by Eerola et al., which comprises labeled short passages of film music. Our initial survey of the dataset indicated that passages with labels other than "happy" and "sad" do not possess a melodic structure. We transcribed the main melody of the happy and sad tracks and used the intervals between the notes to classify them. Our experiments have shown that treating a melody as a bag-of-intervals does not yield any predictive power whatsoever, whereas counting intervals with respect to the key of the melody yielded a classifier with 85% accuracy.
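
The contrast between the two feature types can be illustrated with a small sketch. It assumes melodies are given as MIDI note numbers and that the tonic is known; the abstract does not specify the encoding, so both function names and the representation are hypothetical. The example shows one way a bag of consecutive intervals can fail to separate major from minor material while key-relative counts do.

```python
from collections import Counter


def bag_of_intervals(melody):
    """Count semitone steps between consecutive notes (ignores the key)."""
    return Counter(b - a for a, b in zip(melody, melody[1:]))


def intervals_from_key(melody, tonic):
    """Count each note's distance in semitones above the tonic, modulo the octave."""
    return Counter((note - tonic) % 12 for note in melody)


# Ascending C major and C minor arpeggios as MIDI note numbers (assumed encoding).
major = [60, 64, 67, 72]
minor = [60, 63, 67, 72]

if __name__ == "__main__":
    # Both arpeggios contain one step each of 3, 4, and 5 semitones.
    print(bag_of_intervals(major) == bag_of_intervals(minor))
    # Relative to the tonic, the major third (4) and minor third (3) differ.
    print(intervals_from_key(major, 60), intervals_from_key(minor, 60))
```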

Convex Relaxation Methods for Community Detection

Sep 30, 2018
Xiaodong Li, Yudong Chen, Jiaming Xu

This paper surveys recent theoretical advances in convex optimization approaches for community detection. We introduce some important theoretical techniques and results for establishing the consistency of convex community detection under various statistical models. In particular, we discuss the basic techniques based on the primal and dual analysis. We also present results that demonstrate several distinctive advantages of convex community detection, including robustness against outlier nodes, consistency under weak assortativity, and adaptivity to heterogeneous degrees. This survey is not intended to be a complete overview of the vast literature on this fast-growing topic. Instead, we aim to provide a big picture of the remarkable recent development in this area and to make the survey accessible to a broad audience. We hope that this expository article can serve as an introductory guide for readers who are interested in using, designing, and analyzing convex relaxation methods in network analysis.

* 22 pages 
