"Text": models, code, and papers

From Algebraic Word Problem to Program: A Formalized Approach

Mar 11, 2020
Adam Wiemerslage, Shafiuddin Rehan Ahmed

In this paper, we propose a pipeline to convert grade-school-level algebraic word problems into programs of a formal language, A-IMP. Using natural language processing tools, we break the problem into sentence fragments, which can then be reduced to functions. The functions are categorized by the head verb of the sentence and its structure, as defined by Hosseini et al. (2014). We define the function signature and extract its arguments from the text using dependency parsing. We have a working implementation of the entire pipeline, which can be found in our GitHub repository.

* 9 pages, 6 figures, Course project of Programming Languages 

Continuous Silent Speech Recognition using EEG

Mar 08, 2020
Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik

In this paper we explore continuous silent speech recognition using electroencephalography (EEG) signals. We implemented a connectionist temporal classification (CTC) automatic speech recognition (ASR) model to translate into text the EEG signals recorded in parallel while subjects read English sentences in their mind without producing any voice. Our results demonstrate the feasibility of using EEG signals for continuous silent speech recognition. We report results for a limited English vocabulary consisting of 30 unique sentences.
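
As a rough illustration of the decoding side of a CTC model, the sketch below shows only the standard greedy collapse rule (merge consecutive repeats, then drop blanks). It is not the authors' implementation; the function name `ctc_greedy_collapse`, the blank index 0, and the toy label sequence are assumptions made for this example.

```python
from itertools import groupby

def ctc_greedy_collapse(frame_labels, blank=0):
    """Collapse a per-frame label sequence the CTC way.

    Hypothetical helper: consecutive duplicates are merged first,
    then the blank symbol (assumed to be index 0) is removed.
    """
    # Merge runs of identical labels, e.g. [3, 3] -> [3]
    merged = [label for label, _ in groupby(frame_labels)]
    # Remove blanks; a blank between two equal labels keeps both
    return [label for label in merged if label != blank]

# Toy per-frame argmax output (assumed data, one label per EEG frame)
frames = [0, 3, 3, 0, 3, 5, 5, 0]
decoded = ctc_greedy_collapse(frames)
```

Note that the blank between the two runs of label 3 is what lets the same label appear twice in a row in the decoded output.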

SumQE: a BERT-based Summary Quality Estimation Model

Sep 02, 2019
Stratos Xenouleas, Prodromos Malakasiotis, Marianna Apidianaki, Ion Androutsopoulos

We propose SumQE, a novel Quality Estimation model for summarization based on BERT. The model addresses linguistic quality aspects that are only indirectly captured by content-based approaches to summary evaluation, without involving comparison with human references. SumQE achieves very high correlations with human ratings, outperforming simpler models addressing these linguistic aspects. Predictions of the SumQE model can be used for system development, and to inform users of the quality of automatically produced summaries and other types of generated text.

* In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019), Hong Kong, China, 2019 

Improving Generalization in Coreference Resolution via Adversarial Training

Aug 13, 2019
Sanjay Subramanian, Dan Roth

In order for coreference resolution systems to be useful in practice, they must be able to generalize to new text. In this work, we demonstrate that the performance of the state-of-the-art system decreases when the names of PER and GPE named entities in the CoNLL dataset are changed to names that do not occur in the training set. We use the technique of adversarial gradient-based training to retrain the state-of-the-art system and demonstrate that the retrained system achieves higher performance on the CoNLL dataset (both with and without the change of named entities) and the GAP dataset.

* *SEM 2019 

CS563-QA: A Collection for Evaluating Question Answering Systems

Jul 02, 2019
Katerina Papantoniou, Yannis Tzitzikas

Question Answering (QA) is a challenging topic since it requires tackling the various difficulties of natural language understanding. Evaluation is important not only for identifying the strong and weak points of the various QA techniques, but also for facilitating the inception of new methods and techniques. In this paper we present a collection that we have created for evaluating QA methods over free text. Although it is a small collection, it contains cases of increasing difficulty; it therefore has educational value and can be used for rapid evaluation of QA systems.

* 11 pages 

Unsupervised Training for Large Vocabulary Translation Using Sparse Lexicon and Word Classes

Jan 06, 2019
Yunsu Kim, Julian Schamper, Hermann Ney

We address for the first time unsupervised training for a translation task with hundreds of thousands of vocabulary words. We scale up the expectation-maximization (EM) algorithm to learn a large translation table without any parallel text or seed lexicon. First, we solve the memory bottleneck and enforce the sparsity with a simple thresholding scheme for the lexicon. Second, we initialize the lexicon training with word classes, which efficiently boosts the performance. Our methods produced promising results on two large-scale unsupervised translation tasks.

* Published in EACL 2017 
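
The thresholding idea mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual scheme: the function name `threshold_lexicon`, the threshold value, the fallback of keeping the best entry, and the toy probabilities are all hypothetical.

```python
def threshold_lexicon(lexicon, tau):
    """Sparsify a translation table by dropping low-probability entries.

    Hypothetical sketch: `lexicon` maps a word to a dict of candidate
    translations with probabilities. Entries below `tau` are removed
    and each row is renormalized to sum to 1.
    """
    sparse = {}
    for word, row in lexicon.items():
        kept = {cand: p for cand, p in row.items() if p >= tau}
        if not kept:
            # Assumption: never leave a row empty, keep the best entry
            best = max(row, key=row.get)
            kept = {best: row[best]}
        total = sum(kept.values())
        sparse[word] = {cand: p / total for cand, p in kept.items()}
    return sparse

# Toy table (assumed data), e.g. as produced by one EM iteration
lexicon = {"haus": {"house": 0.90, "home": 0.08, "mouse": 0.02}}
sparse = threshold_lexicon(lexicon, tau=0.05)
```

Applying such a step after each update keeps the table small, which is one plausible way a thresholding scheme could ease a memory bottleneck.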

A Polynomial Time MCMC Method for Sampling from Continuous DPPs

Oct 20, 2018
Shayan Oveis Gharan, Alireza Rezaei

We study the Gibbs sampling algorithm for continuous determinantal point processes. We show that, given a warm start, the Gibbs sampler generates a random sample from a continuous $k$-DPP defined on a $d$-dimensional domain in only $\text{poly}(k)$ steps. As an application, we design an algorithm to generate random samples from $k$-DPPs defined by a spherical Gaussian kernel on the unit sphere in $d$ dimensions, $\mathbb{S}^{d-1}$, in time polynomial in $k$ and $d$.

Learning Visually Grounded Sentence Representations

Jun 04, 2018
Douwe Kiela, Alexis Conneau, Allan Jabri, Maximilian Nickel

We introduce a variety of models, trained on a supervised image captioning corpus to predict the image features for a given caption, to perform sentence representation grounding. We train a grounded sentence encoder that achieves good performance on COCO caption and image retrieval and subsequently show that this encoder can successfully be transferred to various NLP tasks, with improved performance over text-only models. Lastly, we analyze the contribution of grounding, and show that word embeddings learned by this system outperform non-grounded ones.

* Published at NAACL-18 
