"Text": models, code, and papers

Cost-effective Selection of Pretraining Data: A Case Study of Pretraining BERT on Social Media

Oct 02, 2020
Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris

Recent studies on domain-specific BERT models show that effectiveness on downstream tasks can be improved when models are pretrained on in-domain data. Often, the pretraining data used in these models are selected based on their subject matter, e.g., biology or computer science. Given the range of applications using social media text, and its unique language variety, we pretrain two models on tweets and forum text respectively, and empirically demonstrate the effectiveness of these two resources. In addition, we investigate how similarity measures can be used to nominate in-domain pretraining data. We publicly release our pretrained models at

* Findings of EMNLP 2020 
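
As a rough illustration of how a similarity measure might be used to nominate in-domain pretraining data, the sketch below scores each candidate corpus by the Jaccard overlap between its most frequent tokens and those of the target task text. The choice of measure, the whitespace tokenization, and the names `top_vocab`, `vocab_overlap`, and `nominate` are assumptions made for this example; the abstract does not say which measures the authors evaluate.

```python
from collections import Counter


def top_vocab(texts, k=1000):
    """Return the k most frequent lowercased whitespace tokens in a corpus."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    return {word for word, _ in counts.most_common(k)}


def vocab_overlap(source_texts, target_texts, k=1000):
    """Jaccard overlap between the top-k vocabularies of two corpora."""
    source_vocab = top_vocab(source_texts, k)
    target_vocab = top_vocab(target_texts, k)
    union = source_vocab | target_vocab
    if not union:
        return 0.0
    return len(source_vocab & target_vocab) / len(union)


def nominate(candidates, target_texts, k=1000):
    """Score every candidate corpus and return the best name plus all scores."""
    scores = {
        name: vocab_overlap(texts, target_texts, k)
        for name, texts in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Toy corpora, invented for illustration only.
    candidates = {
        "tweets": ["lol omg so tired", "omg that game lol"],
        "news": ["the minister announced a budget", "the budget passed"],
    }
    target = ["omg lol this is great"]
    best, scores = nominate(candidates, target)
    print(best, scores)
```

A real comparison would use a proper tokenizer and far larger samples; the point is only that candidate corpora can be ranked cheaply before committing to a full pretraining run.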

A Joint Model for Multimodal Document Quality Assessment

Jan 14, 2019
Aili Shen, Bahar Salehi, Timothy Baldwin, Jianzhong Qi

The quality of a document is affected by various factors, including grammaticality, readability, stylistics, and expertise depth, making the task of document quality assessment a complex one. In this paper, we explore this task in the context of assessing the quality of Wikipedia articles and academic papers. Observing that the visual rendering of a document can capture implicit quality indicators that are not present in the document text --- such as images, font choices, and visual layout --- we propose a joint model that combines the text content with a visual rendering of the document for document quality assessment. Experimental results over two datasets reveal that textual and visual features are complementary, achieving state-of-the-art results.

Simulating Action Dynamics with Neural Process Networks

May 15, 2018
Antoine Bosselut, Omer Levy, Ari Holtzman, Corin Ennis, Dieter Fox, Yejin Choi

Understanding procedural language requires anticipating the causal effects of actions, even when they are not explicitly stated. In this work, we introduce Neural Process Networks to understand procedural text through (neural) simulation of action dynamics. Our model complements existing memory architectures with dynamic entity tracking by explicitly modeling actions as state transformers. The model updates the states of the entities by executing learned action operators. Empirical results demonstrate that our proposed model can reason about the unstated causal effects of actions, allowing it to provide more accurate contextual information for understanding and generating procedural text, all while offering more interpretable internal representations than existing alternatives.

Language Generation with Recurrent Generative Adversarial Networks without Pre-training

Dec 21, 2017
Ofir Press, Amir Bar, Ben Bogin, Jonathan Berant, Lior Wolf

Generative Adversarial Networks (GANs) have shown great promise recently in image generation. Training GANs for language generation has proven to be more difficult, because of the non-differentiable nature of generating text with recurrent neural networks. Consequently, past work has either resorted to pre-training with maximum-likelihood or used convolutional networks for generation. In this work, we show that recurrent neural networks can be trained to generate text with GANs from scratch using curriculum learning, by slowly teaching the model to generate sequences of increasing and variable length. We empirically show that our approach vastly improves the quality of generated sequences compared to a convolutional baseline.

* Presented at the 1st Workshop on Learning to Generate Natural Language at ICML 2017 
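
The idea of training on sequences of increasing and variable length can be sketched as a simple schedule: the maximum allowed length grows stage by stage, and each training step samples a length up to the current maximum. The function name `curriculum_lengths`, the one-token increment per stage, and the uniform sampling are assumptions for this sketch, not the schedule used in the paper.

```python
import random


def curriculum_lengths(final_len, steps_per_stage, rng=None):
    """Yield (max_len, sampled_len) pairs for a length-based curriculum.

    The maximum length grows by one token per stage (an assumed schedule),
    and each step draws a length uniformly from 1..max_len so the model
    sees variable-length sequences within every stage.
    """
    if rng is None:
        rng = random.Random(0)
    for max_len in range(1, final_len + 1):
        for _ in range(steps_per_stage):
            yield max_len, rng.randint(1, max_len)


def truncate_batch(batch, length):
    """Cut every token sequence in the batch to at most `length` tokens."""
    return [seq[:length] for seq in batch]


if __name__ == "__main__":
    batch = [list("hello world"), list("short")]
    for max_len, length in curriculum_lengths(final_len=4, steps_per_stage=2):
        # A real training loop would run one GAN update on this truncated batch.
        print(max_len, length, truncate_batch(batch, length))
```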

Information Extraction with Character-level Neural Networks and Free Noisy Supervision

Jan 24, 2017
Philipp Meerkamp, Zhengyi Zhou

We present an architecture for information extraction from text that augments an existing parser with a character-level neural network. The network is trained using a measure of consistency of extracted data with existing databases as a form of noisy supervision. Our architecture combines the ability of constraint-based information extraction systems to easily incorporate domain knowledge and constraints with the ability of deep neural networks to leverage large amounts of data to learn complex features. Boosting the existing parser's precision, the system led to large improvements over a mature and highly tuned constraint-based production information extraction system used at Bloomberg for financial language text.

SpeedRead: A Fast Named Entity Recognition Pipeline

Jan 14, 2013
Rami Al-Rfou', Steven Skiena

Online content analysis employs algorithmic methods to identify entities in unstructured text. Both machine learning and knowledge-base approaches lie at the foundation of contemporary named entity extraction systems. However, progress in deploying these approaches at web scale has been hampered by the computational cost of NLP over massive text corpora. We present SpeedRead (SR), a named entity recognition pipeline that runs at least 10 times faster than the Stanford NLP pipeline. This pipeline consists of a high-performance Penn Treebank-compliant tokenizer, a close to state-of-the-art part-of-speech (POS) tagger, and a knowledge-based named entity recognizer.

* Long paper at COLING 2012 

Autogenic Training With Natural Language Processing Modules: A Recent Tool For Certain Neuro Cognitive Studies

Jul 02, 2004
S. Ravichandran, M. N. Karthik

Learning to respond to voice-text input involves the subject's ability in understanding the phonetic and text based contents and his/her ability to communicate based on his/her experience. The neuro-cognitive facility of the subject has to support two important domains in order to make the learning process complete. In many cases, though the understanding is complete, the response is partial. This is one valid reason why we need to support the information from the subject with scalable techniques such as Natural Language Processing (NLP) for abstraction of the contents from the output. This paper explores the feasibility of using NLP modules interlaced with Neural Networks to perform the required task in autogenic training related to medical applications.

* 2 Pages. Proceedings of 11th International Congress on Biological & Medical Engineering, Singapore (IEEE-EMBS & IFMBE endorsed) 

Explaining Classes through Word Attribution

Aug 31, 2021
Samuel Rönnqvist, Amanda Myntti, Aki-Juhani Kyröläinen, Sampo Pyysalo, Veronika Laippala, Filip Ginter

In recent years, several methods have been proposed for explaining individual predictions of deep learning models, yet there has been little study of how to aggregate these predictions to explain how such models view classes as a whole in text classification tasks. In this work, we propose a method for explaining classes using deep learning models and the Integrated Gradients feature attribution technique by aggregating explanations of individual examples in text classification to general descriptions of the classes. We demonstrate the approach on Web register (genre) classification using the XLM-R model and the Corpus of Online Registers of English (CORE), finding that the method identifies plausible and discriminative keywords characterizing all but the smallest class.
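
The aggregation step can be illustrated with a minimal sketch that assumes per-word attribution scores have already been computed for each example (for instance with Integrated Gradients). Averaging scores per word and requiring a minimum number of occurrences are assumptions chosen for this example, and `class_keywords` is a hypothetical helper; the paper's actual aggregation procedure may differ.

```python
from collections import defaultdict


def class_keywords(examples, top_k=5, min_count=2):
    """Aggregate per-example word attributions into class-level keywords.

    `examples` is a list of (label, [(word, score), ...]) pairs. For each
    class, a word's score is its mean attribution over all occurrences;
    words seen fewer than `min_count` times in that class are dropped.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for label, word_scores in examples:
        for word, score in word_scores:
            totals[label][word] += score
            counts[label][word] += 1

    keywords = {}
    for label, word_totals in totals.items():
        means = {
            word: total / counts[label][word]
            for word, total in word_totals.items()
            if counts[label][word] >= min_count
        }
        keywords[label] = sorted(means, key=means.get, reverse=True)[:top_k]
    return keywords


if __name__ == "__main__":
    # Toy attribution scores, invented for illustration only.
    examples = [
        ("recipe", [("stir", 0.9), ("the", 0.1)]),
        ("recipe", [("stir", 0.7), ("the", 0.0), ("oven", 0.8)]),
        ("news", [("said", 0.6), ("the", 0.1)]),
        ("news", [("said", 0.8)]),
    ]
    print(class_keywords(examples))
```

The minimum-count filter keeps a single high-scoring occurrence from dominating a class description, which is one plausible way to favor words that characterize the class as a whole.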

Will Multi-modal Data Improves Few-shot Learning?

Jul 25, 2021
Zilun Zhang, Shihao Ma, Yichun Zhang

Most few-shot learning models utilize only one modality of data. We investigate, qualitatively and quantitatively, how much a model improves when an extra modality is added (i.e., a text description of the image) and how this affects the learning procedure. To this end, we propose four types of fusion method to combine the image features and text features. To verify the improvement, we test the fusion methods with two classical few-shot learning models, ProtoNet and MAML, with image feature extractors such as ConvNet and ResNet12. The attention-based fusion method works best, improving classification accuracy by a large margin, around 30%, compared to the baseline result.

* Project Report 

An Analysis of the Recent Visibility of the SigDial Conference

Jun 30, 2021
Casey Kennington, McKenzie Steenson

Automated speech and text interfaces are continuing to improve, resulting in increased research in the area of dialogue systems. Moreover, conferences and workshops from various fields are focusing more on language through speech and text mediums as candidates for interaction with applications such as search interfaces and robots. In this paper, we explore how visible the SigDial conference is to outside conferences by analysing papers from top Natural Language Processing conferences since 2015 to determine the popularity of certain SigDial-related topics, as well as analysing what SigDial papers are being cited by others outside of SigDial. We find that despite a dramatic increase in dialogue-related research, SigDial visibility has not increased. We conclude by offering some suggestions.
