"Text": models, code, and papers

An Embedded Deep Learning based Word Prediction

Jul 06, 2017
Seunghak Yu, Nilesh Kulkarni, Haejun Lee, Jihie Kim

Recent developments in deep learning with application to language modeling have led to success in text processing, summarization and machine translation. However, deploying huge language models on mobile devices, for example in on-device keyboards, makes computation a bottleneck because of the devices' limited compute capacity. In this work we propose an embedded deep learning based word prediction method that optimizes run-time memory and also provides a real-time prediction environment. Our model is 7.40 MB in size and has an average prediction time of 6.47 ms. We improve over the existing methods for word prediction in terms of key stroke savings and word prediction rate.

* 5 pages, 3 figures, submitted to EMNLP 2017
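
The abstract above reports key stroke savings without defining it. A common definition in the word-prediction literature is the percentage of key presses avoided relative to typing every character. The sketch below assumes that definition; the function name and the numbers in the usage line are illustrative and are not taken from the paper.

```python
def keystroke_savings(total_chars: int, keys_pressed: int) -> float:
    """Percentage of keystrokes saved relative to typing every character.

    Assumed definition: 100 * (total_chars - keys_pressed) / total_chars.
    The paper may compute the metric differently.
    """
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return 100.0 * (total_chars - keys_pressed) / total_chars


# Illustrative numbers only: a 100-character text entered with 60 key presses.
savings = keystroke_savings(total_chars=100, keys_pressed=60)
print(f"{savings:.1f}% of keystrokes saved")
```

Under this definition, a higher value means the predictor completed more of the text on the user's behalf.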


RETURNN: The RWTH Extensible Training framework for Universal Recurrent Neural Networks

Jan 10, 2017
Patrick Doetsch, Albert Zeyer, Paul Voigtlaender, Ilya Kulikov, Ralf Schlüter, Hermann Ney

In this work we release our extensible and easily configurable neural network training software. It provides a rich set of functional layers with a particular focus on efficient training of recurrent neural network topologies on multiple GPUs. The source of the software package is public and freely available for academic research purposes, and it can be used as a framework or as a standalone tool that supports flexible configuration. The software supports training state-of-the-art deep bidirectional long short-term memory (LSTM) models on both one-dimensional data like speech and two-dimensional data like handwritten text, and it was used to develop successful submission systems in several evaluation campaigns.


Surprisal-Driven Feedback in Recurrent Networks

Oct 19, 2016
Kamil M Rocki

Recurrent neural nets are widely used for predicting temporal data. Their inherent deep feedforward structure allows learning complex sequential patterns. It is believed that top-down feedback might be an important missing ingredient which in theory could help disambiguate similar patterns depending on broader context. In this paper we introduce surprisal-driven recurrent networks, which take into account past error information when making new predictions. This is achieved by continuously monitoring the discrepancy between the most recent predictions and the actual observations. Furthermore, we show that this approach outperforms other stochastic and fully deterministic approaches on the enwik8 character-level prediction task, achieving 1.37 BPC on the test portion of the text.

* ICLR 2017 submission, fixed some equations 
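
The BPC (bits per character) figure quoted above is the average number of bits a model needs to encode each character, i.e. the mean negative base-2 log probability it assigns to the characters that actually occur. The sketch below shows that calculation on made-up probabilities; the function name and inputs are illustrative assumptions and have nothing to do with the paper's model or data.

```python
import math
from typing import Sequence


def bits_per_character(true_char_probs: Sequence[float]) -> float:
    """Mean negative log2 probability assigned to the observed characters."""
    if not true_char_probs:
        raise ValueError("need at least one probability")
    if any(p <= 0.0 or p > 1.0 for p in true_char_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    return -sum(math.log2(p) for p in true_char_probs) / len(true_char_probs)


# Hypothetical probabilities a model gave to two observed characters.
print(bits_per_character([0.5, 0.25]))
```

Lower is better: a model that always assigned probability 1 to the correct next character would score 0 BPC.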


Topic Sensitive Neural Headline Generation

Aug 20, 2016
Lei Xu, Ziyun Wang, Ayana, Zhiyuan Liu, Maosong Sun

Neural models have recently been used in text summarization, including headline generation. Such a model can be trained using a set of document-headline pairs. However, the model does not explicitly consider topical similarities and differences between documents. We suggest categorizing documents into various topics so that documents within the same topic are similar in content and share similar summarization patterns. Taking advantage of the topic information of documents, we propose a topic-sensitive neural headline generation model. Our model can generate more accurate summaries guided by document topics. We test our model on the LCSTS dataset, and experiments show that our method outperforms other baselines on each topic and achieves state-of-the-art performance.


Tweet2Vec: Character-Based Distributed Representations for Social Media

May 17, 2016
Bhuwan Dhingra, Zhong Zhou, Dylan Fitzpatrick, Michael Muehl, William W. Cohen

Text from social media provides a set of challenges that can cause traditional NLP approaches to fail. Informal language, spelling errors, abbreviations, and special characters are all commonplace in these posts, leading to a prohibitively large vocabulary size for word-level approaches. We propose a character composition model, tweet2vec, which finds vector-space representations of whole tweets by learning complex, non-local dependencies in character sequences. The proposed model outperforms a word-level baseline at predicting user-annotated hashtags associated with the posts, doing significantly better when the input contains many out-of-vocabulary words or unusual character sequences. Our tweet2vec encoder is publicly available.

* 6 pages, 2 figures, 4 tables, accepted as conference paper at ACL 2016 


What value do explicit high level concepts have in vision to language problems?

Apr 28, 2016
Qi Wu, Chunhua Shen, Lingqiao Liu, Anthony Dick, Anton van den Hengel

Much of the recent progress in Vision-to-Language (V2L) problems has been achieved through a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). This approach does not explicitly represent high-level semantic concepts, but rather seeks to progress directly from image features to text. We propose here a method of incorporating high-level concepts into the very successful CNN-RNN approach, and show that it achieves a significant improvement on the state-of-the-art performance in both image captioning and visual question answering. We also show that the same mechanism can be used to introduce external semantic information and that doing so further improves performance. In doing so we provide an analysis of the value of high-level semantic information in V2L problems.

* Accepted to IEEE Conf. Computer Vision and Pattern Recognition 2016. Fixed title 


Mining Software Quality from Software Reviews: Research Trends and Open Issues

Feb 05, 2016
Issa Atoum, Ahmed Otoom

Software review text fragments contain valuable information about users' experience. They cover a wide set of properties, including software quality. Opinion mining, or sentiment analysis, is concerned with analyzing textual user judgments. Applying sentiment analysis to software reviews can yield a quantitative value that represents software quality. Although many software quality methods have been proposed, they are considered difficult to customize and many of them are limited. This article investigates the application of opinion mining as an approach to extracting software quality properties. We found that the major issues in mining software reviews with sentiment analysis stem from the software lifecycle and the diversity of users and teams.

* International Journal of Computer Trends and Technology, Vol. 31, No. 2, Jan 2016
* 11 pages 


Analysis of a Play by Means of CHAPLIN, the Characters and Places Interaction Network Software

Nov 22, 2015
A. C. Sparavigna, R. Marazzato

Recently, we have developed software capable of gathering information on social networks from written texts. This software, the CHAracters and PLaces Interaction Network (CHAPLIN) tool, is implemented in Visual Basic. With it, the characters and places of a literary work can be extracted from a list of raw words. The software interface helps users select their names from this list. After some parameters are set, CHAPLIN creates a network where nodes represent characters/places and edges give their interactions. Nodes and edges are labelled by performances. In this paper, we propose to use CHAPLIN for the analysis of a William Shakespeare play, the famous 'Tragedy of Hamlet, Prince of Denmark'. Performances of characters in the play as a whole and in each act are given by graphs.

* International Journal of Sciences, 2015, 4(3):60-68 


Description of the Odin Event Extraction Framework and Rule Language

Sep 24, 2015
Marco A. Valenzuela-Escárcega, Gus Hahn-Powell, Mihai Surdeanu

This document describes the Odin framework, which is a domain-independent platform for developing rule-based event extraction models. Odin aims to be powerful (the rule language allows the modeling of complex syntactic structures) and robust (to recover from syntactic parsing errors, syntactic patterns can be freely mixed with surface, token-based patterns), while remaining simple (some domain grammars can be up and running in minutes), and fast (Odin processes over 100 sentences/second in a real-world domain with over 200 rules). Here we include a thorough definition of the Odin rule language, together with a description of the Odin API in the Scala language, which allows one to apply these rules to arbitrary texts.


Robust Subgraph Generation Improves Abstract Meaning Representation Parsing

Jun 10, 2015
Keenon Werling, Gabor Angeli, Christopher Manning

The Abstract Meaning Representation (AMR) is a representation for open-domain rich semantics, with potential use in fields like event extraction and machine translation. Node generation, typically done using a simple dictionary lookup, is currently an important limiting factor in AMR parsing. We propose a small set of actions that derive AMR subgraphs by transformations on spans of text, which allows for more robust learning of this stage. Our set of construction actions generalizes better than the previous approach, and can be learned with a simple classifier. We improve on the previous state-of-the-art result for AMR parsing, boosting end-to-end performance by 3 F$_1$ on both the LDC2013E117 and LDC2014T12 datasets.

* To appear in ACL 2015 
