"Text": models, code, and papers

Do Massively Pretrained Language Models Make Better Storytellers?

Sep 24, 2019
Abigail See, Aneesh Pappu, Rohun Saxena, Akhila Yerukola, Christopher D. Manning

Large neural language models trained on massive amounts of text have emerged as a formidable strategy for Natural Language Understanding tasks. However, the strength of these models as Natural Language Generators is less clear. Though anecdotal evidence suggests that these models generate better quality text, there has been no detailed study characterizing their generation abilities. In this work, we compare the performance of an extensively pretrained model, OpenAI GPT2-117 (Radford et al., 2019), to a state-of-the-art neural story generation model (Fan et al., 2018). By evaluating the generated text across a wide variety of automatic metrics, we characterize the ways in which pretrained models do, and do not, make better storytellers. We find that although GPT2-117 conditions more strongly on context, is more sensitive to ordering of events, and uses more unusual words, it is just as likely to produce repetitive and under-diverse text when using likelihood-maximizing decoding algorithms.

* Accepted to CoNLL 2019 


Detection of texts in natural images

Nov 01, 2014
Gowtham Rangarajan Raman

A framework that uses connected components and a supervised Support Vector Machine (SVM) to recognise text is proposed. The image is preprocessed and an edge graph is calculated using a probabilistic framework to compensate for photometric noise. Connected components over the resultant image are calculated, bounded, and then pruned using geometric constraints. Finally, a Gabor-feature-based SVM is used to classify the presence of text in the candidates. The proposed method was tested with the ICDAR 10 dataset and a few other images available on the internet. It achieved a recall and precision of 0.72 and 0.88, comfortably better than the benchmark Eiphstein's algorithm. The proposed method recorded 0.70 and 0.74 on natural images, which is significantly better than current methods on natural images. The proposed method also scales almost linearly for high-resolution, cluttered images.
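The connected-component and geometric-pruning steps described above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: it assumes a binary grid as input, 4-connectivity, and hypothetical thresholds (`min_area`, `max_aspect`), and it leaves out the edge-graph preprocessing and the Gabor-feature SVM entirely.

```python
from collections import deque


def connected_components(grid):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions of 1s."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def prune(boxes, min_area=2, max_aspect=5.0):
    """Keep boxes that pass simple geometric checks (thresholds are assumptions)."""
    kept = []
    for top, left, bottom, right in boxes:
        height = bottom - top + 1
        width = right - left + 1
        aspect = max(height, width) / min(height, width)
        if height * width >= min_area and aspect <= max_aspect:
            kept.append((top, left, bottom, right))
    return kept


grid = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
candidates = prune(connected_components(grid))
```

In a full pipeline, the surviving boxes would be the candidates passed to the classifier.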


Rows from Many Sources: Enriching row completions from Wikidata with a pre-trained Language Model

Apr 14, 2022
Carina Negreanu, Alperen Karaoglu, Jack Williams, Shuang Chen, Daniel Fabian, Andrew Gordon, Chin-Yew Lin

Row completion is the task of augmenting a given table of text and numbers with additional, relevant rows. The task divides into two steps: subject suggestion, the task of populating the main column; and gap filling, the task of populating the remaining columns. We present state-of-the-art results for subject suggestion and gap filling measured on a standard benchmark (WikiTables). Our idea is to solve this task by harmoniously combining knowledge base table interpretation and free text generation. We interpret the table using the knowledge base to suggest new rows and generate metadata like headers through property linking. To improve candidate diversity, we synthesize additional rows using free text generation via GPT-3, and crucially, we exploit the metadata we interpret to produce better prompts for text generation. Finally, we verify that the additional synthesized content can be linked to the knowledge base or a trusted web source such as Wikipedia.


Consistency and Coherence from Points of Contextual Similarity

Dec 22, 2021
Oleg Vasilyev, John Bohannon

Factual consistency is one of the important summary evaluation dimensions, especially as summary generation becomes more fluent and coherent. The ESTIME measure, recently proposed specifically for factual consistency, achieves high correlations with human expert scores both for consistency and fluency, while in principle being restricted to evaluating text-summary pairs that have high dictionary overlap. This is not a problem for current styles of summarization, but it may become an obstacle for future summarization systems, or for evaluating arbitrary claims against the text. In this work we generalize the method, making it applicable to any text-summary pair. As ESTIME uses points of contextual similarity, it provides insights into the usefulness of information taken from different BERT layers. We observe that useful information exists in almost all of the layers except the several lowest ones. For consistency and fluency - qualities focused on local text details - the most useful layers are close to the top (but not at the top); for coherence and relevance we found a more complicated and interesting picture.

* 9 pages, 7 figures, 1 table 


Evolutionary Algorithm for Sinhala to English Translation

Jul 06, 2019
J. K. Joseph, W. M. T. Chathurika, A. Nugaliyadde, Y. Mallawarachchi

Machine Translation (MT) is an area in natural language processing which focuses on translating from one language to another. Many approaches, ranging from statistical methods to deep learning approaches, are used to achieve MT. However, these methods require either a large amount of data or a clear understanding of the language. The Sinhala language has little digital text that could be used to train a deep neural network. Furthermore, Sinhala has complex rules; therefore, it is harder to create statistical rules in order to apply statistical methods in MT. This research focuses on Sinhala to English translation using an Evolutionary Algorithm (EA). The EA is used to identify the correct meaning of Sinhala text and to translate it to English. The Sinhala text is first processed to identify the correct meaning of the sentence. The translation is then carried out using the EA. Finally, the translated text is passed on to a step that grammatically corrects the sentence. This approach has been shown to achieve accurate results.

* The paper was submitted to National Information Technology Conference (2019) 


A Study of Feature Extraction techniques for Sentiment Analysis

Jun 04, 2019
Avinash Madasu, Sivasankar E

Sentiment Analysis refers to the study of systematically extracting the meaning of subjective text. When analysing sentiments from subjective text using Machine Learning techniques, feature extraction becomes a significant part. We study the performance of the feature extraction techniques TF-IDF (Term Frequency-Inverse Document Frequency) and Doc2vec (Document to Vector) using Cornell movie review datasets, UCI sentiment labeled datasets, and Stanford movie review datasets, effectively classifying the text into positive and negative polarities by using various pre-processing methods like eliminating StopWords and Tokenization, which increases the performance of sentiment analysis in terms of accuracy and time taken by the classifier. The features obtained after applying feature extraction techniques on the text sentences are trained and tested using the classifiers Logistic Regression, Support Vector Machines, K-Nearest Neighbours, Decision Tree and Bernoulli Naive Bayes.
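For readers unfamiliar with TF-IDF, the weighting can be sketched in a few lines. This is only an illustration under stated assumptions: whitespace tokenization, term frequency normalized by document length, and the plain `log(N / df)` inverse document frequency. Libraries commonly use smoothed variants, and the function name `tfidf` is hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


vectors = tfidf(["good movie", "bad movie"])
```

Terms that occur in every document (here `movie`) receive a weight of zero, while terms that distinguish a document (`good`, `bad`) receive positive weight, which is why such features are useful inputs to a polarity classifier.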


Neural Networks Models for Analyzing Magic: the Gathering Cards

Oct 08, 2018
Felipe Zilio, Marcelo Prates, Luis Lamb

Historically, games of all kinds have often been the subject of study in scientific works of Computer Science, including the field of machine learning. By using machine learning techniques and applying them to a game with defined rules or a structured dataset, it's possible to learn and improve on the already existing techniques and methods to tackle new challenges and solve problems that are out of the ordinary. The already existing work on card games tends to focus on gameplay and card mechanics. This work aims to apply neural network models, including Convolutional Neural Networks and Recurrent Neural Networks, to analyze Magic: the Gathering cards, both in terms of card text and illustrations; the card images and texts are used to train the networks to classify them into multiple categories. The ultimate goal was to develop a methodology that could generate card text matching an input image, which was attained by relating the prediction values of the images and generated text across the different categories.

* 10 pages, 1 figure, 9 tables. Accepted at ICONIP 2018 


Amharic Abstractive Text Summarization

Mar 30, 2020
Amr M. Zaki, Mahmoud I. Khalil, Hazem M. Abbas

Text Summarization is the task of condensing long text into just a handful of sentences. Many approaches have been proposed for this task. Some of the very first built statistical models (Extractive Methods) capable of selecting important words and copying them to the output. However, these models lacked the ability to paraphrase sentences, as they simply select important words without understanding their context or meaning. This motivates Deep Learning based architectures (Abstractive Methods), which try to understand the meaning of sentences to build meaningful summaries. In this work we discuss one of these novel approaches, which combines curriculum learning with Deep Learning; this model is called Scheduled Sampling. We apply this work to one of the most widely spoken African languages, Amharic, as we try to enrich the African NLP community with top-notch Deep Learning architectures.

* content 3 pages, reference 2 pages, 2 figures, presented to AfricaNLP workshop ICLR 2020 


Evaluating Style Transfer for Text

Apr 04, 2019
Remi Mir, Bjarke Felbo, Nick Obradovich, Iyad Rahwan

Research in the area of style transfer for text is currently bottlenecked by a lack of standard evaluation practices. This paper aims to alleviate this issue by experimentally identifying best practices with a Yelp sentiment dataset. We specify three aspects of interest (style transfer intensity, content preservation, and naturalness) and show how to obtain more reliable measures of them from human evaluation than in previous work. We propose a set of metrics for automated evaluation and demonstrate that they are more strongly correlated and in agreement with human judgment: direction-corrected Earth Mover's Distance, Word Mover's Distance on style-masked texts, and adversarial classification for the respective aspects. We also show that the three examined models exhibit tradeoffs between aspects of interest, demonstrating the importance of evaluating style transfer models at specific points of their tradeoff plots. We release software with our evaluation metrics to facilitate research.

* To appear in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics 


Improving Abstraction in Text Summarization

Aug 23, 2018
Wojciech Kryściński, Romain Paulus, Caiming Xiong, Richard Socher

Abstractive text summarization aims to shorten long text documents into a human readable form that contains the most important facts from the original document. However, the level of actual abstraction as measured by novel phrases that do not appear in the source document remains low in existing approaches. We propose two techniques to improve the level of abstraction of generated summaries. First, we decompose the decoder into a contextual network that retrieves relevant parts of the source document, and a pretrained language model that incorporates prior knowledge about language generation. Second, we propose a novelty metric that is optimized directly through policy learning to encourage the generation of novel phrases. Our model achieves results comparable to state-of-the-art models, as determined by ROUGE scores and human evaluations, while achieving a significantly higher level of abstraction as measured by n-gram overlap with the source document.
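The notion of abstraction as "novel phrases that do not appear in the source document" can be illustrated with a simple n-gram measure. The sketch below is an assumption-laden illustration, not the paper's novelty metric: it uses whitespace tokenization and lowercasing, and the function name `novel_ngram_fraction` is hypothetical.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def novel_ngram_fraction(source, summary, n=2):
    """Fraction of summary n-grams that never occur in the source."""
    source_ngrams = set(ngrams(source.lower().split(), n))
    summary_ngrams = ngrams(summary.lower().split(), n)
    if not summary_ngrams:
        return 0.0  # summary shorter than n tokens: nothing to count
    novel = sum(1 for gram in summary_ngrams if gram not in source_ngrams)
    return novel / len(summary_ngrams)


source = "the cat sat on the mat"
summary = "the cat lay on the mat"
bigram_novelty = novel_ngram_fraction(source, summary, n=2)
```

A purely extractive summary that copies a contiguous span scores 0 under this measure, while a fully paraphrased one approaches 1.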
