Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Text": models, code, and papers

Countering Language Drift with Seeded Iterated Learning

Apr 06, 2020
Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville

Supervised learning methods excel at capturing statistical properties of language when trained over large text corpora. Yet, these models often produce inconsistent outputs in goal-oriented language settings as they are not trained to complete the underlying task. Moreover, as soon as the agents are finetuned to maximize task completion, they suffer from the so-called language drift phenomenon: they slowly lose syntactic and semantic properties of language as they only focus on solving the task. In this paper, we propose a generic approach to counter language drift by using iterated learning. We iterate between fine-tuning agents with interactive training steps, and periodically replacing them with new agents that are seeded from last iteration and trained to imitate the latest finetuned models. Iterated learning does not require external syntactic constraint nor semantic knowledge, making it a valuable task-agnostic finetuning protocol. We first explore iterated learning in the Lewis Game. We then scale-up the approach in the translation game. In both settings, our results show that iterated learn-ing drastically counters language drift as well as it improves the task completion metric.

  Access Paper or Ask Questions

Automatic Extraction of Bengali Root Verbs using Paninian Grammar

Mar 31, 2020
Arijit Das, Tapas Halder, Diganta Saha

In this research work, we have proposed an algorithm based on supervised learning methodology to extract the root forms of the Bengali verbs using the grammatical rules proposed by Panini [1] in Ashtadhyayi. This methodology can be applied for the languages which are derived from Sanskrit. The proposed system has been developed based on tense, person and morphological inflections of the verbs to find their root forms. The work has been executed in two phases: first, the surface level forms or inflected forms of the verbs have been classified into a certain number of groups of similar tense and person. For this task, a standard pattern, available in Bengali language has been used. Next, a set of rules have been applied to extract the root form from the surface level forms of a verb. The system has been tested on 10000 verbs collected from the Bengali text corpus developed in the TDIL project of the Govt. of India. The accuracy of the output has been achieved 98% which is verified by a linguistic expert. Root verb identification is a key step in semantic searching, multi-sentence search query processing, understanding the meaning of a language, disambiguation of word sense, classification of the sentences etc.

* published in 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), Bangalore, 2017 

  Access Paper or Ask Questions

Annotating and normalizing biomedical NEs with limited knowledge

Dec 19, 2019
Fernando Sánchez León, Ana González Ledesma

Named entity recognition (NER) is the very first step in the linguistic processing of any new domain. It is currently a common process in BioNLP on English clinical text. However, it is still in its infancy in other major languages, as it is the case for Spanish. Presented under the umbrella of the PharmaCoNER shared task, this paper describes a very simple method for the annotation and normalization of pharmacological, chemical and, ultimately, biomedical named entities in clinical cases. The system developed for the shared task is based on limited knowledge, collected, structured and munged in a way that clearly outperforms scores obtained by similar dictionary-based systems for English in the past. Along with this recovering of the knowledge-based methods for NER in subdomains, the paper also highlights the key contribution of resource-based systems in the validation and consolidation of both the annotation guidelines and the human annotation practices. In this sense, some of the authors discoverings on the overall quality of human annotated datasets question the above-mentioned `official' results obtained by this system, that ranked second (0.91 F1-score) and first (0.916 F1-score), respectively, in the two PharmaCoNER subtasks.

* 8 pages; unpublished contribution to the PharmaCoNER shared task held as part of BioNLP-OST 2019 

  Access Paper or Ask Questions

Data Exploration and Validation on dense knowledge graphs for biomedical research

Dec 08, 2019
Jens Dörpinghaus, Alexander Apke, Vanessa Lage-Rupprecht, Andreas Stefan

Here we present a holistic approach for data exploration on dense knowledge graphs as a novel approach with a proof-of-concept in biomedical research. Knowledge graphs are increasingly becoming a vital factor in knowledge mining and discovery as they connect data using technologies from the semantic web. In this paper we extend a basic knowledge graph extracted from biomedical literature by context data like named entities and relations obtained by text mining and other linked data sources like ontologies and databases. We will present an overview about this novel network. The aim of this work was to extend this current knowledge with approaches from graph theory. This method will build the foundation for quality control, validation of hypothesis, detection of missing data and time series analysis of biomedical knowledge in general. In this context we tried to apply multiple-valued decision diagrams to these questions. In addition this knowledge representation of linked data can be used as FAIR approach to answer semantic questions. This paper sheds new lights on dense and very large knowledge graphs and the importance of a graph-theoretic understanding of these networks.

  Access Paper or Ask Questions

Non-Intrusive Load Monitoring with an Attention-based Deep Neural Network

Nov 15, 2019
Antonio Maria Sudoso, Veronica Piccialli

Energy disaggregation, also referred to as a Non-Intrusive Load Monitoring (NILM), is the task of using an aggregate energy signal, for example coming from a whole-home power monitor, to make inferences about the different individual loads of the system. In this paper, we present a novel approach based on the encoder-decoder deep learning framework with an attention mechanism for solving NILM. The attention mechanism is inspired by the temporal attention mechanism that has been recently applied to get state-of-the-art results in neural machine translation, text summarization and speech recognition. The experiments have been conducted on two publicly available datasets AMPds and UK-DALE in seen and unseen conditions. The results show that our proposed deep neural network outperforms the state-of-the-art Denoising Auto-Encoder (DAE) proposed initially by Kelly and Knottenbely (2015) and its extended and improved architecture by Bonfigli et al. (2018), in all the addressed experimental conditions. We also show that modeling attention translates into the ability to correctly detect the state change of each appliance, that is of extreme interest in the field of energy disaggregation.

  Access Paper or Ask Questions

Generative Models for Effective ML on Private, Decentralized Datasets

Nov 15, 2019
Sean Augenstein, H. Brendan McMahan, Daniel Ramage, Swaroop Ramaswamy, Peter Kairouz, Mingqing Chen, Rajiv Mathews, Blaise Aguera y Arcas

To improve real-world applications of machine learning, experienced modelers develop intuition about their datasets, their models, and how the two interact. Manual inspection of raw data - of representative samples, of outliers, of misclassifications - is an essential tool in a) identifying and fixing problems in the data, b) generating new modeling hypotheses, and c) assigning or refining human-provided labels. However, manual data inspection is problematic for privacy sensitive datasets, such as those representing the behavior of real-world individuals. Furthermore, manual data inspection is impossible in the increasingly important setting of federated learning, where raw examples are stored at the edge and the modeler may only access aggregated outputs such as metrics or model parameters. This paper demonstrates that generative models - trained using federated methods and with formal differential privacy guarantees - can be used effectively to debug many commonly occurring data issues even when the data cannot be directly inspected. We explore these methods in applications to text with differentially private federated RNNs and to images using a novel algorithm for differentially private federated GANs.

* 27 pages, 8 figures 

  Access Paper or Ask Questions

Enriching Conversation Context in Retrieval-based Chatbots

Nov 06, 2019
Amir Vakili Tahami, Azadeh Shakery

Work on retrieval-based chatbots, like most sequence pair matching tasks, can be divided into Cross-encoders that perform word matching over the pair, and Bi-encoders that encode the pair separately. The latter has better performance, however since candidate responses cannot be encoded offline, it is also much slower. Lately, multi-layer transformer architectures pre-trained as language models have been used to great effect on a variety of natural language processing and information retrieval tasks. Recent work has shown that these language models can be used in text-matching scenarios to create Bi-encoders that perform almost as well as Cross-encoders while having a much faster inference speed. In this paper, we expand upon this work by developing a sequence matching architecture that %takes into account contexts in the training dataset at inference time. utilizes the entire training set as a makeshift knowledge-base during inference. We perform detailed experiments demonstrating that this architecture can be used to further improve Bi-encoders performance while still maintaining a relatively high inference speed.

* 8 pages, 1 figure, 3 tables 

  Access Paper or Ask Questions

Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments

Oct 29, 2019
Martin Weiss, Simon Chamorro, Roger Girgis, Margaux Luck, Samira E. Kahou, Joseph P. Cohen, Derek Nowrouzezahrai, Doina Precup, Florian Golemo, Chris Pal

Millions of blind and visually-impaired (BVI) people navigate urban environments every day, using smartphones for high-level path-planning and white canes or guide dogs for local information. However, many BVI people still struggle to travel to new places. In our endeavor to create a navigation assistant for the BVI, we found that existing Reinforcement Learning (RL) environments were unsuitable for the task. This work introduces SEVN, a sidewalk simulation environment and a neural network-based approach to creating a navigation agent. SEVN contains panoramic images with labels for house numbers, doors, and street name signs, and formulations for several navigation tasks. We study the performance of an RL algorithm (PPO) in this setting. Our policy model fuses multi-modal observations in the form of variable resolution images, visible text, and simulated GPS data to navigate to a goal door. We hope that this dataset, simulator, and experimental results will provide a foundation for further research into the creation of agents that can assist members of the BVI community with outdoor navigation.

* Accepted at CoRL2019. Code & video available at 

  Access Paper or Ask Questions

A Novel Hierarchical Binary Tagging Framework for Joint Extraction of Entities and Relations

Sep 07, 2019
Zhepei Wei, Jianlin Su, Yue Wang, Yuan Tian, Yi Chang

Extracting relational triples from unstructured text is crucial for large-scale knowledge graph construction. However, few existing works excel in solving the overlapping triple problem where multiple relational triples in the same sentence share the same entities. We propose a novel Hierarchical Binary Tagging (HBT) framework derived from a principled problem formulation. Instead of treating relations as discrete labels as in previous works, our new framework models relations as functions that map subjects to objects in a sentence, which naturally handles overlapping triples. Experiments show that the proposed framework already outperforms state-of-the-art methods even its encoder module uses a randomly initialized BERT encoder, showing the power of the new tagging framework. It enjoys further performance boost when employing a pretrained BERT encoder, outperforming the strongest baseline by 25.6 and 45.9 absolute gain in F1-score on two public datasets NYT and WebNLG, respectively. In-depth analysis on different types of overlapping triples shows that the method delivers consistent performance gain in all scenarios.

* 9 pages, 4 figures 

  Access Paper or Ask Questions