Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Text": models, code, and papers

Tag-Enhanced Tree-Structured Neural Networks for Implicit Discourse Relation Classification

Mar 03, 2018
Yizhong Wang, Sujian Li, Jingfeng Yang, Xu Sun, Houfeng Wang

Identifying implicit discourse relations between text spans is a challenging task because it requires understanding the meaning of the text. To tackle this task, recent studies have tried several deep learning methods but few of them exploited the syntactic information. In this work, we explore the idea of incorporating syntactic parse tree into neural networks. Specifically, we employ the Tree-LSTM model and Tree-GRU model, which are based on the tree structure, to encode the arguments in a relation. Moreover, we further leverage the constituent tags to control the semantic composition process in these tree-structured neural networks. Experimental results show that our method achieves state-of-the-art performance on PDTB corpus.

* Accepted by IJCNLP 2017, 10 pages 

  Access Paper or Ask Questions

Updating Singular Value Decomposition for Rank One Matrix Perturbation

Jul 26, 2017
Ratnik Gandhi, Amoli Rajgor

An efficient Singular Value Decomposition (SVD) algorithm is an important tool for distributed and streaming computation in big data problems. It is observed that update of singular vectors of a rank-1 perturbed matrix is similar to a Cauchy matrix-vector product. With this observation, in this paper, we present an efficient method for updating Singular Value Decomposition of rank-1 perturbed matrix in $O(n^2 \ \text{log}(\frac{1}{\epsilon}))$ time. The method uses Fast Multipole Method (FMM) for updating singular vectors in $O(n \ \text{log} (\frac{1}{\epsilon}))$ time, where $\epsilon$ is the precision of computation.

  Access Paper or Ask Questions

NeuroNER: an easy-to-use program for named-entity recognition based on neural networks

May 16, 2017
Franck Dernoncourt, Ji Young Lee, Peter Szolovits

Named-entity recognition (NER) aims at identifying entities of interest in a text. Artificial neural networks (ANNs) have recently been shown to outperform existing NER systems. However, ANNs remain challenging to use for non-expert users. In this paper, we present NeuroNER, an easy-to-use named-entity recognition tool based on ANNs. Users can annotate entities using a graphical web-based user interface (BRAT): the annotations are then used to train an ANN, which in turn predict entities' locations and categories in new texts. NeuroNER makes this annotation-training-prediction flow smooth and accessible to anyone.

* The first two authors contributed equally to this work 

  Access Paper or Ask Questions

Numerically Grounded Language Models for Semantic Error Correction

Aug 14, 2016
Georgios P. Spithourakis, Isabelle Augenstein, Sebastian Riedel

Semantic error detection and correction is an important task for applications such as fact checking, speech-to-text or grammatical error correction. Current approaches generally focus on relatively shallow semantics and do not account for numeric quantities. Our approach uses language models grounded in numbers within the text. Such groundings are easily achieved for recurrent neural language model architectures, which can be further conditioned on incomplete background knowledge bases. Our evaluation on clinical reports shows that numerical grounding improves perplexity by 33% and F1 for semantic error correction by 5 points when compared to ungrounded approaches. Conditioning on a knowledge base yields further improvements.

* accepted to EMNLP 2016 

  Access Paper or Ask Questions

LMdiff: A Visual Diff Tool to Compare Language Models

Nov 02, 2021
Hendrik Strobelt, Benjamin Hoover, Arvind Satyanarayan, Sebastian Gehrmann

While different language models are ubiquitous in NLP, it is hard to contrast their outputs and identify which contexts one can handle better than the other. To address this question, we introduce LMdiff, a tool that visually compares probability distributions of two models that differ, e.g., through finetuning, distillation, or simply training with different parameter sizes. LMdiff allows the generation of hypotheses about model behavior by investigating text instances token by token and further assists in choosing these interesting text instances by identifying the most interesting phrases from large corpora. We showcase the applicability of LMdiff for hypothesis generation across multiple case studies. A demo is available at .

* EMNLP 2021 Demo Paper 

  Access Paper or Ask Questions

DISCO : efficient unsupervised decoding for discrete natural language problems via convex relaxation

Jul 13, 2021
Anish Acharya, Rudrajit Das

In this paper we study test time decoding; an ubiquitous step in almost all sequential text generation task spanning across a wide array of natural language processing (NLP) problems. Our main contribution is to develop a continuous relaxation framework for the combinatorial NP-hard decoding problem and propose Disco - an efficient algorithm based on standard first order gradient based. We provide tight analysis and show that our proposed algorithm linearly converges to within $\epsilon$ neighborhood of the optima. Finally, we perform preliminary experiments on the task of adversarial text generation and show superior performance of Disco over several popular decoding approaches.

  Access Paper or Ask Questions

Hybrid approach to detecting symptoms of depression in social media entries

Jun 19, 2021
Agnieszka Wołk, Karol Chlasta, Paweł Holas

Sentiment and lexical analyses are widely used to detect depression or anxiety disorders. It has been documented that there are significant differences in the language used by a person with emotional disorders in comparison to a healthy individual. Still, the effectiveness of these lexical approaches could be improved further because the current analysis focuses on what the social media entries are about, and not how they are written. In this study, we focus on aspects in which these short texts are similar to each other, and how they were created. We present an innovative approach to the depression screening problem by applying Collgram analysis, which is a known effective method of obtaining linguistic information from texts. We compare these results with sentiment analysis based on the BERT architecture. Finally, we create a hybrid model achieving a diagnostic accuracy of 71%.

* 11 pages, 4 figures, 2 tables, The Pacific Asia Conference on Information Systems (PACIS2021) 

  Access Paper or Ask Questions

Relation Clustering in Narrative Knowledge Graphs

Nov 27, 2020
Simone Mellace, K Vani, Alessandro Antonucci

When coping with literary texts such as novels or short stories, the extraction of structured information in the form of a knowledge graph might be hindered by the huge number of possible relations between the entities corresponding to the characters in the novel and the consequent hurdles in gathering supervised information about them. Such issue is addressed here as an unsupervised task empowered by transformers: relational sentences in the original text are embedded (with SBERT) and clustered in order to merge together semantically similar relations. All the sentences in the same cluster are finally summarized (with BART) and a descriptive label extracted from the summary. Preliminary tests show that such clustering might successfully detect similar relations, and provide a valuable preprocessing for semi-supervised approaches.

* Accepted for AI4Narratives Workshop at 29th International Joint Conference on Artificial Intelligence and the 17th Pacific Rim International Conference on Artificial Intelligence 

  Access Paper or Ask Questions

On the importance of pre-training data volume for compact language models

Oct 09, 2020
Vincent Micheli, Martin d'Hoffschmidt, François Fleuret

Recent advances in language modeling have led to computationally intensive and resource-demanding state-of-the-art models. In an effort towards sustainable practices, we study the impact of pre-training data volume on compact language models. Multiple BERT-based models are trained on gradually increasing amounts of French text. Through fine-tuning on the French Question Answering Dataset (FQuAD), we observe that well-performing models are obtained with as little as 100 MB of text. In addition, we show that past critically low amounts of pre-training data, an intermediate pre-training step on the task-specific corpus does not yield substantial improvements.

* EMNLP 2020; typo corrected 

  Access Paper or Ask Questions