"Text": models, code, and papers

Exploring Chemical Space using Natural Language Processing Methodologies for Drug Discovery

Feb 10, 2020
Hakime Öztürk, Arzucan Özgür, Philippe Schwaller, Teodoro Laino, Elif Ozkirimli

Text-based representations of chemicals and proteins can be thought of as unstructured languages codified by humans to describe domain-specific knowledge. Advances in natural language processing (NLP) methodologies in the processing of spoken languages accelerated the application of NLP to elucidate hidden knowledge in textual representations of these biochemical entities and then use it to construct models to predict molecular properties or to design novel molecules. This review outlines the impact made by these advances on drug discovery and aims to further the dialogue between medicinal chemists and computer scientists.

BSDAR: Beam Search Decoding with Attention Reward in Neural Keyphrase Generation

Sep 17, 2019
Iftitahu Ni'mah, Vlado Menkovski, Mykola Pechenizkiy

This study mainly investigates two decoding problems in neural keyphrase generation: sequence length bias and beam diversity. We introduce an extension of beam search inference based on word-level and n-gram-level attention scores to adjust and constrain Seq2Seq prediction at test time. Results show that our proposed solution can overcome the algorithm's bias toward shorter and nearly identical sequences, resulting in a significant improvement in decoding performance when generating keyphrases that are present in and absent from the source text.
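
The length bias mentioned in this abstract can be illustrated with a small rescoring sketch. It is not the paper's scoring rule: the function `rescore`, the `reward_weight` parameter, and the per-token attention values are all hypothetical, chosen only to show how a length-normalized score plus an attention-based bonus can change which beam hypothesis ranks first.

```python
from typing import List


def rescore(log_probs: List[float], attn_scores: List[float],
            reward_weight: float = 1.0) -> float:
    """Hypothetical score: mean token log-probability plus a weighted
    mean attention score (an illustrative stand-in for an attention reward)."""
    n = len(log_probs)
    avg_log_prob = sum(log_probs) / n      # length normalization
    avg_attention = sum(attn_scores) / n   # assumed per-token attention signal
    return avg_log_prob + reward_weight * avg_attention


# Two made-up beam hypotheses: a one-token and a three-token keyphrase.
short_lp, short_attn = [-0.5], [0.2]
long_lp, long_attn = [-0.6, -0.6, -0.6], [0.8, 0.7, 0.9]

# Summed log-probability favors the shorter sequence.
raw_prefers_short = sum(short_lp) > sum(long_lp)

# The rescored ranking favors the longer, better-attended sequence.
rescored_prefers_long = (rescore(long_lp, long_attn)
                         > rescore(short_lp, short_attn))
```

Summing log-probabilities penalizes every extra token, which is why plain beam search tends toward short outputs; averaging removes that penalty, and the bonus term rewards hypotheses whose tokens are well supported by the source.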

Feature-Less End-to-End Nested Term Extraction

Aug 15, 2019
Yuze Gao, Yu Yuan

In this paper, we propose a deep learning-based end-to-end method for domain-specific automatic term extraction (ATE). It considers possible term spans within a fixed length in the sentence and predicts whether they can be conceptual terms. In comparison with current ATE methods, the model supports nested term extraction and does not crucially need extra (extracted) features. Results show that it can achieve high recall and comparable precision on the term extraction task with segmented raw text as input.

* NLPCC Workshop on Explainable Artificial Intelligence 2019 
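
The span-based formulation in this abstract can be sketched as follows. This is a minimal illustration of enumerating candidate spans up to a fixed length, not the authors' implementation; the function name `enumerate_spans`, the `max_len` parameter, and the example tokens are assumptions, and the classifier that scores each span is omitted.

```python
from typing import List, Tuple


def enumerate_spans(tokens: List[str], max_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for every span of
    1 to max_len tokens in the sentence."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


tokens = ["deep", "neural", "network", "model"]
spans = enumerate_spans(tokens, max_len=3)
candidates = [" ".join(tokens[s:e]) for s, e in spans]
```

Because every span is a separate candidate, a nested term such as "neural network" can be accepted alongside the longer "deep neural network" that contains it, which a token-level tagging scheme cannot express directly.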

JUMT at WMT2019 News Translation Task: A Hybrid approach to Machine Translation for Lithuanian to English

Aug 01, 2019
Sainik Kumar Mahata, Avishek Garain, Adityar Rayala, Dipankar Das, Sivaji Bandyopadhyay

In the current work, we present a description of the system submitted to WMT 2019 News Translation Shared task. The system was created to translate news text from Lithuanian to English. To accomplish the given task, our system used a Word Embedding based Neural Machine Translation model to post edit the outputs generated by a Statistical Machine Translation model. The current paper documents the architecture of our model, descriptions of the various modules and the results produced using the same. Our system garnered a BLEU score of 17.6.

* arXiv admin note: substantial text overlap with arXiv:1908.00323 

Exploring difference in public perceptions on HPV vaccine between gender groups from Twitter using deep learning

Jul 06, 2019
Jingcheng Du, Chongliang Luo, Qiang Wei, Yong Chen, Cui Tao

In this study, we propose a convolutional neural network model for gender prediction using English Twitter text as input. An ensemble of the proposed model achieved an accuracy of 0.8237 on gender prediction and compared favorably with the state-of-the-art performance in a recent author profiling task. We further leveraged the trained models to predict gender labels in an HPV vaccine-related corpus and identified gender differences in public perceptions of the HPV vaccine. The findings are largely consistent with previous survey-based studies.

* This manuscript has been accepted by 2019 KDD Workshop on Applied Data Science for Healthcare 

Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning

May 31, 2019
Tahira Naseem, Abhishek Shah, Hui Wan, Radu Florian, Salim Roukos, Miguel Ballesteros

Our work involves enriching the Stack-LSTM transition-based AMR parser (Ballesteros and Al-Onaizan, 2017) by augmenting training with Policy Learning and rewarding the Smatch score of sampled graphs. In addition, we combined several AMR-to-text alignments with an attention mechanism, and we supplemented the parser with pre-processed concept identification, named entities and contextualized embeddings. We achieve a highly competitive performance that is comparable to the best published results. We show an in-depth study ablating each of the new components of the parser.

* Accepted as short paper at ACL 2019 

M-GWAP: An Online and Multimodal Game With A Purpose in WordPress for Mental States Annotation

May 30, 2019
Fabio Paolizzo

M-GWAP is a multimodal game with a purpose that leverages the wisdom-of-crowds phenomenon for the annotation of multimedia data in terms of mental states. This game with a purpose is developed in WordPress to allow users to implement the game without programming skills. The game adopts motivational strategies to keep the player engaged, such as a score system, text motivators while playing, a ranking system to foster competition and mechanics for identity building. The current version of the game was deployed after alpha and beta testing helped refine the game accordingly.

* 2 figures, 4 tables. The research is supported by the EU through the MUSICAL-MOODS project funded by the Marie Sklodowska-Curie Actions Individual Fellowships Global Fellowships (MSCA-IF-GF) of the Horizon 2020 Programme H2020/2014-2020, REA grant agreement n.659434 

Improving Context Modelling in Multimodal Dialogue Generation

Oct 20, 2018
Shubham Agarwal, Ondrej Dusek, Ioannis Konstas, Verena Rieser

In this work, we investigate the task of textual response generation in a multimodal task-oriented dialogue system. Our work is based on the recently released Multimodal Dialogue (MMD) dataset (Saha et al., 2017) in the fashion domain. We introduce a multimodal extension to the Hierarchical Recurrent Encoder-Decoder (HRED) model and show that this extension outperforms strong baselines in terms of text-based similarity metrics. We also showcase the shortcomings of current vision and language models by performing an error analysis on our system's output.

NeuralREG: An end-to-end approach to referring expression generation

May 21, 2018
Thiago Castro Ferreira, Diego Moussallem, Ákos Kádár, Sander Wubben, Emiel Krahmer

Traditionally, Referring Expression Generation (REG) models first decide on the form and then on the content of references to discourse entities in text, typically relying on features such as salience and grammatical function. In this paper, we present a new approach (NeuralREG), relying on deep neural networks, which makes decisions about form and content in one go without explicit feature extraction. Using a delexicalized version of the WebNLG corpus, we show that the neural model substantially improves over two strong baselines. Data and models are publicly available.

* Accepted for presentation at ACL 2018 

Email Classification into Relevant Category Using Neural Networks

Feb 12, 2018
Deepak Kumar Gupta, Shruti Goyal

In the real world, many online shopping websites or service providers have a single email ID where customers can send their queries, concerns, etc. At the back end, the service provider receives millions of emails every week; how can they identify which email belongs to a particular department? This paper presents an artificial neural network (ANN) model that is used to solve this problem, and experiments are carried out on users' personal Gmail datasets. This problem can be generalised as typical text classification or categorization.

* 9 Pages, 6 figures, 2 Tables 