"Text": models, code, and papers

Adversarial Multi-Binary Neural Network for Multi-class Classification

Mar 25, 2020
Haiyang Xu, Junwen Chen, Kun Han, Xiangang Li

Multi-class text classification is one of the key problems in machine learning and natural language processing. Emerging neural networks deal with the problem using a multi-output softmax layer and achieve substantial progress, but they do not explicitly learn the correlation among classes. In this paper, we use a multi-task framework to address multi-class classification, where a multi-class classifier and multiple binary classifiers are trained together. Moreover, we employ adversarial training to distinguish the class-specific features and the class-agnostic features. The model benefits from better feature representation. We conduct experiments on two large-scale multi-class text classification tasks and demonstrate that the proposed architecture outperforms baseline approaches.
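
As a rough illustration of the multi-task idea in this abstract, the sketch below combines a softmax cross-entropy term with one-vs-rest binary cross-entropy terms for a single example. The function names, the `binary_weight` parameter, and the plain sum of the two terms are assumptions made for illustration; the abstract does not specify how the losses are combined, and the adversarial component is omitted.

```python
import math

def multiclass_loss(logits, label):
    """Softmax cross-entropy for one example, in log-sum-exp form."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_norm - logits[label]

def binary_loss(logit, target):
    """Numerically stable sigmoid cross-entropy for one binary classifier."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def joint_loss(logits, binary_logits, label, binary_weight=1.0):
    """Multi-class loss plus one binary (class k vs. rest) loss per class.

    binary_weight is a hypothetical knob, not taken from the paper.
    """
    binary_total = sum(
        binary_loss(z, 1.0 if k == label else 0.0)
        for k, z in enumerate(binary_logits)
    )
    return multiclass_loss(logits, label) + binary_weight * binary_total
```

With all logits at zero and three classes, the joint loss is log 3 from the softmax head plus 3 log 2 from the binary heads.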

Features in Extractive Supervised Single-document Summarization: Case of Persian News

Sep 09, 2019
Hosein Rezaei, Seyed Amid Moeinzadeh, Azar Shahgholian, Mohamad Saraee

Text summarization has been one of the most challenging areas of research in NLP. Much effort has been made to overcome this challenge by using either abstractive or extractive methods. Extractive methods are more popular because they are simpler than the more elaborate abstractive methods. In extractive approaches, the system does not generate sentences. Instead, it learns to score sentences within the text using textual features and then selects those with the highest rank. The core objective is therefore ranking, which depends heavily on the document. This dependency has gone unnoticed by many state-of-the-art solutions. In this work, the features of the document are integrated into the vector of every sentence. In this way, the system becomes informed about the context, which increases the precision of the learned model and consequently produces comprehensive and brief summaries.
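
The idea of folding document features into every sentence vector can be sketched as a simple concatenation. The specific features below (word count, relative position, number of sentences, mean sentence length) are hypothetical placeholders; the abstract does not list the features the authors use.

```python
def sentence_features(sentence, position, num_sentences):
    """Hypothetical per-sentence features: word count and relative position."""
    words = sentence.split()
    return [len(words), position / max(num_sentences - 1, 1)]

def document_features(sentences):
    """Hypothetical document-level features: sentence count and mean length."""
    lengths = [len(s.split()) for s in sentences]
    return [len(sentences), sum(lengths) / len(lengths)]

def document_aware_vectors(sentences):
    """Append the same document features to each sentence's feature vector."""
    doc = document_features(sentences)
    n = len(sentences)
    return [sentence_features(s, i, n) + doc for i, s in enumerate(sentences)]
```

A downstream ranker would then score each of these document-aware vectors and keep the top-ranked sentences as the summary.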

A Survey of Reinforcement Learning Informed by Natural Language

Jun 10, 2019
Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel

To be successful in real-world tasks, Reinforcement Learning (RL) needs to exploit the compositional, relational, and hierarchical structure of the world, and learn to transfer it to the task at hand. Recent advances in representation learning for language make it possible to build models that acquire world knowledge from text corpora and integrate this knowledge into downstream decision making problems. We thus argue that the time is right to investigate a tight integration of natural language understanding into RL in particular. We survey the state of the field, including work on instruction following, text games, and learning from textual domain knowledge. Finally, we call for the development of new environments as well as further investigation into the potential uses of recent Natural Language Processing (NLP) techniques for such tasks.

* Published at IJCAI'19 

Neural Language Modeling with Visual Features

Mar 07, 2019
Antonios Anastasopoulos, Shankar Kumar, Hank Liao

Multimodal language models attempt to incorporate non-linguistic features for the language modeling task. In this work, we extend a standard recurrent neural network (RNN) language model with features derived from videos. We train our models on data that is two orders of magnitude larger than datasets used in prior work. We perform a thorough exploration of model architectures for combining visual and text features. Our experiments on two corpora (YouCookII and 20bn-something-something-v2) show that the best-performing architecture consists of middle fusion of visual and text features, yielding over 25% relative improvement in perplexity. We report analysis that provides insights into why our multimodal language model improves upon a standard RNN language model.
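
For readers unfamiliar with the metric, the following minimal sketch shows how perplexity is computed from per-token log-probabilities and how a relative improvement is derived from two perplexity values. The function names are illustrative, and any numbers passed in are made up for the example rather than taken from the paper.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def relative_improvement(baseline_ppl, model_ppl):
    """Fraction by which perplexity drops relative to the baseline."""
    return (baseline_ppl - model_ppl) / baseline_ppl
```

For instance, a drop from a baseline perplexity of 100 to 75 would be a 25% relative improvement under this definition.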

Diacritization of Maghrebi Arabic Sub-Dialects

Oct 29, 2018
Ahmed Abdelali, Mohammed Attia, Younes Samihy, Kareem Darwish, Hamdy Mubarak

The diacritization process attempts to restore the short vowels in written Arabic text, which are typically omitted. This process is essential for applications such as Text-to-Speech (TTS). While diacritization of Modern Standard Arabic (MSA) still holds the lion's share, research on dialectal Arabic (DA) diacritization is very limited. In this paper, we present our contribution and results on the automatic diacritization of two sub-dialects of Maghrebi Arabic, namely Tunisian and Moroccan, using a character-level deep neural network architecture that stacks two bi-LSTM layers over a CRF output layer. The model achieves a word error rate of 2.7% and 3.6% for Moroccan and Tunisian, respectively, and is capable of implicitly identifying the sub-dialect of the input.

* 6 pages, 3 figures 
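
The word error rate reported above can be illustrated with a minimal sketch, assuming that diacritization leaves the word count unchanged so that reference and predicted words align one to one. The function name and this exact definition are assumptions; the abstract does not spell out the evaluation protocol.

```python
def diacritization_wer(reference_words, predicted_words):
    """Fraction of words whose predicted diacritized form differs from the reference."""
    if len(reference_words) != len(predicted_words):
        raise ValueError("reference and prediction must have the same number of words")
    if not reference_words:
        return 0.0
    errors = sum(1 for ref, hyp in zip(reference_words, predicted_words) if ref != hyp)
    return errors / len(reference_words)
```

A word counts as an error if any of its diacritics is wrong, so this measure is stricter than a per-character error rate.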

Differentiable Perturb-and-Parse: Semi-Supervised Parsing with a Structured Variational Autoencoder

Jul 25, 2018
Caio Corro, Ivan Titov

Human annotation for syntactic parsing is expensive, and large resources are available only for a fraction of languages. A question we ask is whether one can leverage abundant unlabeled texts to improve syntactic parsers, beyond just using the texts to obtain more generalisable lexical features (i.e. beyond word embeddings). To this end, we propose a novel latent-variable generative model for semi-supervised syntactic dependency parsing. As exact inference is intractable, we introduce a differentiable relaxation to obtain approximate samples and compute gradients with respect to the parser parameters. Our method (Differentiable Perturb-and-Parse) relies on differentiable dynamic programming over stochastically perturbed edge scores. We demonstrate the effectiveness of our approach with experiments on English, French and Swedish.
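
To give a feel for "stochastically perturbed edge scores", the sketch below adds Gumbel noise to a matrix of head-selection scores and then applies a temperature-controlled softmax over candidate heads for each word. This is a deliberately simplified stand-in: it ignores the tree constraint and is not the differentiable dynamic program the authors describe. The Gumbel noise choice, the function names, and the `temperature` parameter are all assumptions for illustration.

```python
import math
import random

def sample_gumbel(rng):
    """Draw one standard Gumbel sample via the inverse-CDF trick."""
    u = max(rng.random(), 1e-12)  # avoid log(0)
    return -math.log(-math.log(u))

def perturb_scores(scores, rng):
    """Add independent Gumbel noise to every edge score.

    scores[i][j] is the score of word j being the head of word i.
    """
    return [[s + sample_gumbel(rng) for s in row] for row in scores]

def relaxed_head_distribution(scores, temperature=1.0):
    """Per-word softmax over candidate heads (no tree constraint enforced)."""
    relaxed = []
    for row in scores:
        m = max(row)
        exps = [math.exp((s - m) / temperature) for s in row]
        total = sum(exps)
        relaxed.append([e / total for e in exps])
    return relaxed
```

Lower temperatures push each row toward a one-hot head choice, while higher temperatures give smoother distributions.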

Acquiring Background Knowledge to Improve Moral Value Prediction

Sep 16, 2017
Ying Lin, Joe Hoover, Morteza Dehghani, Marlon Mooijman, Heng Ji

In this paper, we address the problem of detecting expressions of moral values in tweets using content analysis. This is a particularly challenging problem because moral values are often only implicitly signaled in language, and tweets contain little contextual information due to length constraints. To address these obstacles, we present a novel approach to automatically acquire background knowledge from an external knowledge base to enrich input texts and thus improve moral value prediction. By combining basic text features with background knowledge, our overall context-aware framework achieves performance comparable to a single human annotator. To the best of our knowledge, this is the first attempt to incorporate background knowledge for the prediction of implicit psychological variables in the area of computational social science.

* 8 pages, 4 figures 

Emotion Detection on TV Show Transcripts with Sequence-based Convolutional Neural Networks

Aug 14, 2017
Sayyed M. Zahiri, Jinho D. Choi

While there have been significant advances in detecting emotions from speech and image recognition, emotion detection on text is still under-explored and remains an active research field. This paper introduces a corpus for text-based emotion detection on multiparty dialogue as well as deep neural models that outperform the existing approaches for document classification. We first present a new corpus that provides annotation of seven emotions on consecutive utterances in dialogues extracted from the show Friends. We then suggest four types of sequence-based convolutional neural network models with attention that leverage the sequence information encapsulated in dialogue. Our best model achieves accuracies of 37.9% and 54% for fine- and coarse-grained emotions, respectively. Given the difficulty of this task, this is promising.

See, Hear, and Read: Deep Aligned Representations

Jun 03, 2017
Yusuf Aytar, Carl Vondrick, Antonio Torralba

We capitalize on large amounts of readily available, synchronous data to learn deep discriminative representations shared across three major natural modalities: vision, sound and language. By leveraging over a year of sound from video and millions of sentences paired with images, we jointly train a deep convolutional network for aligned representation learning. Our experiments suggest that this representation is useful for several tasks, such as cross-modal retrieval or transferring classifiers between modalities. Moreover, although our network is only trained with image+text and image+sound pairs, it can transfer between text and sound as well, a transfer the network never observed during training. Visualizations of our representation reveal many hidden units which automatically emerge to detect concepts, independent of the modality.
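
Cross-modal retrieval in a shared embedding space can be sketched as a nearest-neighbour search: embed the query from one modality, then rank items from another modality by similarity. The use of cosine similarity, the function names, and the toy two-dimensional embeddings are assumptions for illustration; the abstract does not state which similarity measure is used.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_embedding, candidates, top_k=1):
    """Return the names of the top_k candidates most similar to the query.

    candidates maps an item name to its embedding in the shared space.
    """
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine(query_embedding, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]
```

Here a text query embedding could be matched against sound embeddings, mirroring the text-to-sound transfer described in the abstract.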

Talking Face Generation with Multilingual TTS

May 13, 2022
Hyoung-Kyu Song, Sang Hoon Woo, Junhyeok Lee, Seungmin Yang, Hyunjae Cho, Youseong Lee, Dongho Choi, Kang-wook Kim

In this work, we propose a joint system combining a talking face generation system with a text-to-speech system that can generate multilingual talking face videos from only the text input. Our system can synthesize natural multilingual speech while maintaining the vocal identity of the speaker, as well as lip movements synchronized to the synthesized speech. We demonstrate the generalization capabilities of our system by selecting four languages (Korean, English, Japanese, and Chinese), each from a different language family. We also compare the outputs of our talking face generation model to the outputs of prior work that claims multilingual support. For our demo, we add a translation API to the preprocessing stage and present it in the form of a neural dubber so that users can utilize the multilingual property of our system more easily.

* Accepted to CVPR Demo Track (2022) 
