Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Text": models, code, and papers

Unsupervised Paraphrase Generation using Pre-trained Language Models

Jun 09, 2020
Chaitra Hegde, Shrikumar Patil

Large scale Pre-trained Language Models have proven to be very powerful approach in various Natural language tasks. OpenAI's GPT-2 \cite{radford2019language} is notable for its capability to generate fluent, well formulated, grammatically consistent text and for phrase completions. In this paper we leverage this generation capability of GPT-2 to generate paraphrases without any supervision from labelled data. We examine how the results compare with other supervised and unsupervised approaches and the effect of using paraphrases for data augmentation on downstream tasks such as classification. Our experiments show that paraphrases generated with our model are of good quality, are diverse and improves the downstream task performance when used for data augmentation.


  Access Paper or Ask Questions

Deep Hierarchical Classification for Category Prediction in E-commerce System

May 14, 2020
Dehong Gao, Wenjing Yang, Huiling Zhou, Yi Wei, Yi Hu, Hao Wang

In e-commerce system, category prediction is to automatically predict categories of given texts. Different from traditional classification where there are no relations between classes, category prediction is reckoned as a standard hierarchical classification problem since categories are usually organized as a hierarchical tree. In this paper, we address hierarchical category prediction. We propose a Deep Hierarchical Classification framework, which incorporates the multi-scale hierarchical information in neural networks and introduces a representation sharing strategy according to the category tree. We also define a novel combined loss function to punish hierarchical prediction losses. The evaluation shows that the proposed approach outperforms existing approaches in accuracy.

* 5pages, to be published in ECNLP workshop of ACL20 

  Access Paper or Ask Questions

Smart To-Do : Automatic Generation of To-Do Items from Emails

May 05, 2020
Sudipto Mukherjee, Subhabrata Mukherjee, Marcello Hasegawa, Ahmed Hassan Awadallah, Ryen White

Intelligent features in email service applications aim to increase productivity by helping people organize their folders, compose their emails and respond to pending tasks. In this work, we explore a new application, Smart-To-Do, that helps users with task management over emails. We introduce a new task and dataset for automatically generating To-Do items from emails where the sender has promised to perform an action. We design a two-stage process leveraging recent advances in neural text generation and sequence-to-sequence learning, obtaining BLEU and ROUGE scores of 0:23 and 0:63 for this task. To the best of our knowledge, this is the first work to address the problem of composing To-Do items from emails.

* 58th annual meeting of the Association for Computational Linguistics (ACL), 2020 

  Access Paper or Ask Questions

CLUE: A Chinese Language Understanding Evaluation Benchmark

Apr 14, 2020
Liang Xu, Xuanwei Zhang, Lu Li, Hai Hu, Chenjie Cao, Weitang Liu, Junyi Li, Yudong Li, Kai Sun, Yechen Xu, Yiming Cui, Cong Yu, Qianqian Dong, Yin Tian, Dian Yu, Bo Shi, Jun Zeng, Rongzhao Wang, Weijian Xie, Yanting Li, Yina Patterson, Zuoyu Tian, Yiwen Zhang, He Zhou, Shaoweihua Liu, Qipeng Zhao, Cong Yue, Xinrui Zhang, Zhengliang Yang, Zhenzhong Lan

We introduce CLUE, a Chinese Language Understanding Evaluation benchmark. It contains eight different tasks, including single-sentence classification, sentence pair classification, and machine reading comprehension. We evaluate CLUE on a number of existing full-network pre-trained models for Chinese. We also include a small hand-crafted diagnostic test set designed to probe specific linguistic phenomena using different models, some of which are unique to Chinese. Along with CLUE, we release a large clean crawled raw text corpus that can be used for model pre-training. We release CLUE, baselines and pre-training dataset on Github.

* 9 pages, 4 figures 

  Access Paper or Ask Questions

SPN-CNN: Boosting Sensor-Based Source Camera Attribution With Deep Learning

Feb 07, 2020
Matthias Kirchner, Cameron Johnson

We explore means to advance source camera identification based on sensor noise in a data-driven framework. Our focus is on improving the sensor pattern noise (SPN) extraction from a single image at test time. Where existing works suppress nuisance content with denoising filters that are largely agnostic to the specific SPN signal of interest, we demonstrate that a~deep learning approach can yield a more suitable extractor that leads to improved source attribution. A series of extensive experiments on various public datasets confirms the feasibility of our approach and its applicability to image manipulation localization and video source attribution. A critical discussion of potential pitfalls completes the text.

* Presented at the IEEE International Workshop on Information Forensics and Security (WIFS) 2019 

  Access Paper or Ask Questions

Aspect and Opinion Terms Extraction Using Double Embeddings and Attention Mechanism for Indonesian Hotel Reviews

Aug 19, 2019
Jordhy Fernando, Masayu Leylia Khodra, Ali Akbar Septiandri

Aspect and opinion terms extraction from review texts is one of the key tasks in aspect-based sentiment analysis. In order to extract aspect and opinion terms for Indonesian hotel reviews, we adapt double embeddings feature and attention mechanism that outperform the best system at SemEval 2015 and 2016. We conduct experiments using 4000 reviews to find the best configuration and show the influences of double embeddings and attention mechanism toward model performance. Using 1000 reviews for evaluation, we achieved F1-measure of 0.914 and 0.90 for aspect and opinion terms extraction in token and entity (term) level respectively.


  Access Paper or Ask Questions

Latent Space Factorisation and Manipulation via Matrix Subspace Projection

Jul 26, 2019
Xiao Li, Chenghua Lin, Chaozheng Wang, Frank Guerin

This paper proposes a novel method for factorising the information in the latent space of an autoencoder (AE), to improve the interpretability of the latent space and facilitate controlled generation. When trained on a dataset with labelled attributes we can produce a latent vector which separates information encoding the attributes from other characteristic information, and also disentangles the attribute information. This then allows us to manipulate each attribute of the latent representation individually without affecting others. Our method, matrix subspace projection, is simpler than the state of the art adversarial network approaches to latent space factorisation. We demonstrate the utility of the method for attribute manipulation tasks on the CelebA image dataset and the E2E text corpus.


  Access Paper or Ask Questions

A Corpus for Modeling User and Language Effects in Argumentation on Online Debating

Jun 26, 2019
Esin Durmus, Claire Cardie

Existing argumentation datasets have succeeded in allowing researchers to develop computational methods for analyzing the content, structure and linguistic features of argumentative text. They have been much less successful in fostering studies of the effect of "user" traits -- characteristics and beliefs of the participants -- on the debate/argument outcome as this type of user information is generally not available. This paper presents a dataset of 78, 376 debates generated over a 10-year period along with surprisingly comprehensive participant profiles. We also complete an example study using the dataset to analyze the effect of selected user traits on the debate outcome in comparison to the linguistic features typically employed in studies of this kind.


  Access Paper or Ask Questions

Federated Learning for Emoji Prediction in a Mobile Keyboard

Jun 11, 2019
Swaroop Ramaswamy, Rajiv Mathews, Kanishka Rao, Françoise Beaufays

We show that a word-level recurrent neural network can predict emoji from text typed on a mobile keyboard. We demonstrate the usefulness of transfer learning for predicting emoji by pretraining the model using a language modeling task. We also propose mechanisms to trigger emoji and tune the diversity of candidates. The model is trained using a distributed on-device learning framework called federated learning. The federated model is shown to achieve better performance than a server-trained model. This work demonstrates the feasibility of using federated learning to train production-quality models for natural language understanding tasks while keeping users' data on their devices.


  Access Paper or Ask Questions

<<
987
988
989
990
991
992
993
994
995
996
997
998
999
>>