Title: Biomedical Named Entity Recognition via Reference-Set Augmented Bootstrapping

Jun 01, 2019

Joel Mathew, Shobeir Fakhraei, José Luis Ambite

Abstract: We present a weakly-supervised data augmentation approach to improve Named Entity Recognition (NER) in a challenging domain: extracting biomedical entities (e.g., proteins) from the scientific literature. First, we train a neural NER (NNER) model over a small seed of fully-labeled examples. Second, we use a reference set of entity names (e.g., proteins in UniProt) to identify entity mentions with high precision, but low recall, on an unlabeled corpus. Third, we use the NNER model to assign weak labels to the corpus. Finally, we retrain our NNER model iteratively over the augmented training set, including the seed, the reference-set examples, and the weakly-labeled examples, which improves model performance. We show empirically that this augmented bootstrapping process significantly improves NER performance, and discuss the factors impacting the efficacy of the approach.

* 5 pages, 1 figure, 2 tables, ICML 2019 Workshop on Computational Biology
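
The four steps in the abstract can be sketched as a small bootstrapping loop. This is only an illustrative sketch, not the authors' implementation: the neural NER model is replaced by a toy token-memorizing tagger, the reference set is matched by exact lowercase token lookup, and the rule for merging reference-set labels with the model's weak labels (a reference match wins) is an assumption. All function names (`train`, `predict`, `reference_labels`, `bootstrap`) and the `ENT`/`O` tag scheme are hypothetical.

```python
from collections import Counter, defaultdict


def train(examples):
    """Toy stand-in for the neural NER model (assumption): memorize each
    token's majority tag from (tokens, tags) pairs."""
    counts = defaultdict(Counter)
    for tokens, tags in examples:
        for tok, tag in zip(tokens, tags):
            counts[tok.lower()][tag] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}


def predict(model, tokens):
    """Tag tokens with the memorized tag; unseen tokens default to "O"."""
    return [model.get(tok.lower(), "O") for tok in tokens]


def reference_labels(tokens, reference_set):
    """High-precision, low-recall labels: exact matches against the
    reference set of (lowercased) entity names."""
    return ["ENT" if tok.lower() in reference_set else "O" for tok in tokens]


def bootstrap(seed, unlabeled, reference_set, iterations=2):
    model = train(seed)  # Step 1: train on the fully labeled seed.
    for _ in range(iterations):
        augmented = list(seed)
        for tokens in unlabeled:
            ref = reference_labels(tokens, reference_set)  # Step 2
            weak = predict(model, tokens)                  # Step 3
            # Assumed merge rule: a reference-set match overrides the
            # model's weak label; otherwise keep the weak label.
            merged = [r if r == "ENT" else w for r, w in zip(ref, weak)]
            augmented.append((tokens, merged))
        model = train(augmented)  # Step 4: retrain on the augmented set.
    return model


# Hypothetical toy data for illustration only.
seed = [(["BRCA1", "binds", "DNA"], ["ENT", "O", "O"])]
unlabeled = [["TP53", "binds", "DNA"]]
reference_set = {"tp53"}

model = bootstrap(seed, unlabeled, reference_set)
print(predict(model, ["TP53", "regulates", "BRCA1"]))  # ['ENT', 'O', 'ENT']
```

In this toy run, the seed alone never teaches the tagger about `TP53`; the entity is picked up only because the reference set labels it in the unlabeled sentence, which mirrors the role the abstract assigns to the reference set.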
