"Text": models, code, and papers

On the Impact of Knowledge-based Linguistic Annotations in the Quality of Scientific Embeddings

Apr 13, 2021
Andres Garcia-Silva, Ronald Denaux, Jose Manuel Gomez-Perez

In essence, embedding algorithms work by optimizing the distance between a word and its usual context in order to generate an embedding space that encodes the distributional representation of words. In addition to single words or word pieces, other features that result from the linguistic analysis of text, including lexical, grammatical and semantic information, can be used to improve the quality of embedding spaces. However, until now we have lacked a precise understanding of the impact that such individual annotations and their possible combinations may have on the quality of the embeddings. In this paper, we conduct a comprehensive study on the use of explicit linguistic annotations to generate embeddings from a scientific corpus and quantify their impact on the resulting representations. Our results show how the effect of such annotations on the embeddings varies depending on the evaluation task. In general, we observe that learning embeddings using linguistic annotations contributes to achieving better evaluation results.

* Accepted for publication in Future Generation Computer Systems 


CLEVR_HYP: A Challenge Dataset and Baselines for Visual Question Answering with Hypothetical Actions over Images

Apr 13, 2021
Shailaja Keyur Sampat, Akshay Kumar, Yezhou Yang, Chitta Baral

Most existing research on visual question answering (VQA) is limited to information explicitly present in an image or a video. In this paper, we take visual understanding to a higher level where systems are challenged to answer questions that involve mentally simulating the hypothetical consequences of performing specific actions in a given scenario. Towards that end, we formulate a vision-language question answering task based on the CLEVR (Johnson et al., 2017) dataset. We then modify the best existing VQA methods and propose baseline solvers for this task. Finally, we motivate the development of better vision-language models by providing insights about the capability of diverse architectures to perform joint reasoning over image-text modality. Our dataset setup scripts and code will be made publicly available at

* 16 pages, 11 figures, Accepted as a Long Paper at NAACL-HLT 2021 


A Question-answering Based Framework for Relation Extraction Validation

Apr 07, 2021
Jiayang Cheng, Haiyun Jiang, Deqing Yang, Yanghua Xiao

Relation extraction is an important task in knowledge acquisition and text understanding. Existing works mainly focus on improving relation extraction by extracting effective features or designing reasonable model structures. However, few works have focused on how to validate and correct the results generated by the existing relation extraction models. We argue that validation is an important and promising direction to further improve the performance of relation extraction. In this paper, we explore the possibility of using question answering as validation. Specifically, we propose a novel question-answering based framework to validate the results from relation extraction models. Our proposed framework can be easily applied to existing relation classifiers without any additional information. We conduct extensive experiments on the popular NYT dataset to evaluate the proposed framework, and observe consistent improvements over five strong baselines.


TREC 2020 Podcasts Track Overview

Mar 29, 2021
Rosie Jones, Ben Carterette, Ann Clifton, Maria Eskevich, Gareth J. F. Jones, Jussi Karlgren, Aasish Pappu, Sravana Reddy, Yongze Yu

The Podcast Track is new at the Text Retrieval Conference (TREC) in 2020. The track was designed to encourage research into podcasts in the information retrieval and NLP research communities. It consisted of two shared tasks, segment retrieval and summarization, both based on a dataset of over 100,000 podcast episodes (metadata, audio, and automatic transcripts) which was released concurrently with the track. The track generated considerable interest and attracted hundreds of new registrations to TREC; fifteen teams, mostly disjoint between search and summarization, made final submissions for assessment. Deep learning was the dominant experimental approach for both search experiments and summarization. This paper gives an overview of the tasks and the results of the participants' experiments. The track will return to TREC 2021 with the same two tasks, incorporating slight modifications in response to participant feedback.

* The Proceedings of the Twenty-Ninth Text REtrieval Conference Proceedings (TREC 2020) 


SocialNLP EmotionGIF 2020 Challenge Overview: Predicting Reaction GIF Categories on Social Media

Feb 24, 2021
Boaz Shmueli, Lun-Wei Ku, Soumya Ray

We present an overview of the EmotionGIF2020 Challenge, held at the 8th International Workshop on Natural Language Processing for Social Media (SocialNLP), in conjunction with ACL 2020. The challenge required predicting affective reactions to online texts, and included the EmotionGIF dataset, with tweets labeled for the reaction categories. The novel dataset included 40K tweets with their reaction GIFs. Due to the special circumstances of year 2020, two rounds of the competition were conducted. A total of 84 teams registered for the task. Of these, 25 teams successfully submitted entries to the evaluation phase in the first round, while 13 teams participated successfully in the second round. Of the top participants, five teams presented a technical report and shared their code. The top score of the winning team on the challenge metric was 62.47%.

* The 8th International Workshop on Natural Language Processing for Social Media co-located with ACL-2020. 7 pages, 5 figures, 3 tables 


I Want This Product but Different : Multimodal Retrieval with Synthetic Query Expansion

Feb 17, 2021
Ivona Tautkute, Tomasz Trzcinski

This paper addresses the problem of media retrieval using a multimodal query (a query which combines visual input with additional semantic information in natural language feedback). We propose a SynthTriplet GAN framework which resolves this task by expanding the multimodal query with a synthetically generated image that captures semantic information from both image and text input. We introduce a novel triplet mining method that uses a synthetic image as an anchor to directly optimize for embedding distances of generated and target images. We demonstrate that, beyond the added value of illustrating retrieval with a synthetic image, with a focus on customization and user feedback, the proposed method greatly surpasses other multimodal generation methods and achieves state-of-the-art results in the multimodal retrieval task. We also show that, in contrast to other retrieval methods, our method provides explainable embeddings.

* Under review 
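
The triplet objective mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a standard margin-based triplet loss over Euclidean distances, with the embedding of the synthetic image as the anchor and the embedding of the target image as the positive. The function names, the margin value, and the toy two-dimensional embeddings are hypothetical and are not taken from the paper; the authors' mining procedure and loss may differ.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard margin-based triplet loss (assumed form, not the paper's exact loss).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical embeddings: the synthetic image acts as the anchor.
synthetic_anchor = [0.9, 0.1]   # embedding of the generated image
target_image = [1.0, 0.0]       # embedding of the image the user wants
other_image = [0.0, 1.0]        # embedding of an unrelated catalogue image

loss_good = triplet_loss(synthetic_anchor, target_image, other_image)
loss_bad = triplet_loss(synthetic_anchor, other_image, target_image)
print(loss_good, loss_bad)
```

In this toy setup the first triplet already satisfies the margin, so its loss is zero, while swapping the positive and negative yields a large loss that a trainer would push down.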


SPAN: a Simple Predict & Align Network for Handwritten Paragraph Recognition

Feb 17, 2021
Denis Coquenet, Clément Chatelain, Thierry Paquet

Unconstrained handwriting recognition is an essential task in document analysis. It is usually carried out in two steps. First, the document is segmented into text lines. Second, an Optical Character Recognition model is applied to these line images. We propose the Simple Predict & Align Network: an end-to-end recurrence-free Fully Convolutional Network performing OCR at paragraph level without any prior segmentation stage. The framework is as simple as the one used for the recognition of isolated lines and we achieve competitive results on three popular datasets: RIMES, IAM and READ 2016. The proposed model does not require any dataset adaptation; it can be trained from scratch, without segmentation labels, and it does not require line breaks in the transcription labels. Our code and trained model weights are available at


On Data-Augmentation and Consistency-Based Semi-Supervised Learning

Jan 18, 2021
Atin Ghosh, Alexandre H. Thiery

Recently proposed consistency-based Semi-Supervised Learning (SSL) methods such as the $\Pi$-model, temporal ensembling, the mean teacher, or virtual adversarial training, have advanced the state of the art in several SSL tasks. These methods can typically reach performance comparable to that of their fully supervised counterparts while using only a fraction of labelled examples. Despite these methodological advances, the understanding of these methods is still relatively limited. In this text, we analyse (variations of) the $\Pi$-model in settings where analytically tractable results can be obtained. We establish links with Manifold Tangent Classifiers and demonstrate that the quality of the perturbations is key to obtaining reasonable SSL performance. Importantly, we propose a simple extension of the Hidden Manifold Model that naturally incorporates data-augmentation schemes and offers a framework for understanding and experimenting with SSL methods.

* ICLR 2021 
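
To make the consistency idea behind the $\Pi$-model concrete, here is a minimal sketch of the usual objective: a supervised cross-entropy term on labelled examples plus a penalty on the squared difference between two predictions for the same input under independent random perturbations. The toy linear classifier, the Gaussian input noise standing in for data augmentation, and all names and default values are assumptions made for illustration; they are not the models or settings analysed in the paper.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Toy linear classifier: one weight row per class."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def perturb(x, rng, scale):
    """Gaussian input noise, used here as a stand-in for data augmentation."""
    return [v + rng.gauss(0.0, scale) for v in x]


def pi_model_loss(weights, labelled, unlabelled, rng, scale=0.1, consistency_weight=1.0):
    """Supervised cross-entropy plus a weighted consistency penalty."""
    supervised = 0.0
    for x, y in labelled:
        supervised -= math.log(predict(weights, perturb(x, rng, scale))[y])
    supervised /= len(labelled)

    # The consistency term needs no labels, so it uses every input.
    inputs = [x for x, _ in labelled] + list(unlabelled)
    consistency = 0.0
    for x in inputs:
        first = predict(weights, perturb(x, rng, scale))
        second = predict(weights, perturb(x, rng, scale))
        consistency += sum((a - b) ** 2 for a, b in zip(first, second)) / len(first)
    consistency /= len(inputs)

    return supervised + consistency_weight * consistency


weights = [[1.0, -1.0], [-1.0, 1.0]]
labelled = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
unlabelled = [[0.8, 0.1], [0.2, 0.9], [0.5, 0.5]]
loss = pi_model_loss(weights, labelled, unlabelled, random.Random(0))
print(loss)
```

With `scale=0.0` the two perturbed predictions coincide and the consistency term vanishes, which is one simple way to see why the choice of perturbation matters for this family of methods.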


EstBERT: A Pretrained Language-Specific BERT for Estonian

Nov 09, 2020
Hasan Tanvir, Claudia Kittask, Kairit Sirts

This paper presents EstBERT, a large pretrained transformer-based language-specific BERT model for Estonian. Recent work has evaluated multilingual BERT models on Estonian tasks and found them to outperform the baselines. Still, based on existing studies on other languages, a language-specific BERT model is expected to improve over the multilingual ones. We first describe the EstBERT pretraining process and then present the results of the models based on finetuned EstBERT for multiple NLP tasks, including POS and morphological tagging, named entity recognition and text classification. The evaluation results show that the models based on EstBERT outperform multilingual BERT models on five tasks out of six, providing further evidence towards a view that training language-specific BERT models is still useful, even when multilingual models are available.


Deep Diacritization: Efficient Hierarchical Recurrence for Improved Arabic Diacritization

Nov 01, 2020
Badr AlKhamissi, Muhammad N. ElNokrashy, Mohamed Gabr

We propose a novel architecture for labelling character sequences that achieves state-of-the-art results on the Tashkeela Arabic diacritization benchmark. The core is a two-level recurrence hierarchy that operates on the word and character levels separately, enabling faster training and inference than comparable traditional models. A cross-level attention module further connects the two, and opens the door for network interpretability. The task module is a softmax classifier that enumerates valid combinations of diacritics. This architecture can be extended with a recurrent decoder that optionally accepts priors from partially diacritized text, which improves results. We employ extra tricks such as sentence dropout and majority voting to further boost the final result. Our best model achieves a WER of 5.34%, outperforming the previous state-of-the-art with a 30.56% relative error reduction.

* This work was accepted at the Fifth Arabic Natural Language Processing Workshop (COLING/WANLP 2020) 
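
The two figures reported in the abstract above (a WER of 5.34% and a 30.56% relative error reduction) can be related by simple arithmetic. The sketch below assumes the common definition of relative error reduction, (previous - new) / previous; the abstract does not state the previous WER, so the value computed here is only what that definition implies. The function names are hypothetical.

```python
def relative_error_reduction(previous_error, new_error):
    """Fraction of the previous error removed by the new system."""
    return (previous_error - new_error) / previous_error


def implied_previous_error(new_error, reduction):
    """Invert the definition above to recover the previous error rate."""
    return new_error / (1.0 - reduction)


reported_wer = 5.34          # percent, from the abstract
reported_reduction = 0.3056  # 30.56% relative error reduction, from the abstract

previous_wer = implied_previous_error(reported_wer, reported_reduction)
print(f"implied previous WER: {previous_wer:.2f}%")
```

Under that assumption, the previous state-of-the-art WER works out to roughly 7.69%.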
