Alex Ratner

Proceedings of the First Workshop on Weakly Supervised Learning

Jul 08, 2021

Michael A. Hedderich, Benjamin Roth, Katharina Kann, Barbara Plank, Alex Ratner, Dietrich Klakow

Abstract: Welcome to WeaSuL 2021, the First Workshop on Weakly Supervised Learning, co-located with ICLR 2021. In this workshop, we want to advance theory, methods and tools for allowing experts to express prior coded knowledge for automatic data annotations that can be used to train arbitrary deep neural networks for prediction. The ICLR 2021 Workshop on Weak Supervision aims at advancing methods that help modern machine-learning methods to generalize from knowledge provided by experts, in interaction with observable (unlabeled) data. In total, 15 papers were accepted. All the accepted contributions are listed in these Proceedings.


SwellShark: A Generative Model for Biomedical Named Entity Recognition without Labeled Data

Apr 20, 2017

Jason Fries, Sen Wu, Alex Ratner, Christopher Ré

Abstract: We present SwellShark, a framework for building biomedical named entity recognition (NER) systems quickly and without hand-labeled data. Our approach views biomedical resources like lexicons as function primitives for autogenerating weak supervision. We then use a generative model to unify and denoise this supervision and construct large-scale, probabilistically labeled datasets for training high-accuracy NER taggers. In three biomedical NER tasks, SwellShark achieves competitive scores with state-of-the-art supervised benchmarks using no hand-labeled training data. In a drug name extraction task using patient medical records, one domain expert using SwellShark achieved within 5.1% of a crowdsourced annotation approach -- which originally utilized 20 teams over the course of several weeks -- in 24 hours.
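
The idea of turning lexicons into weak supervision sources can be illustrated with a minimal sketch. This is not SwellShark's implementation: the lexicons, the function names, and the label constants below are all hypothetical, and a plain vote fraction stands in for the generative model that the abstract describes for unifying and denoising the sources.

```python
# Toy illustration of lexicon-based weak supervision for drug-name tagging.
# All names and data here are hypothetical; a simple vote fraction replaces
# the generative model used in the paper.

ABSTAIN, NOT_ENTITY, ENTITY = -1, 0, 1

# Hypothetical toy lexicons standing in for real biomedical resources.
DRUG_LEXICON = {"aspirin", "ibuprofen", "metformin"}
STOPWORDS = {"the", "of", "and", "was", "given"}


def lf_in_lexicon(token):
    """Vote ENTITY when the token appears in the drug lexicon."""
    return ENTITY if token.lower() in DRUG_LEXICON else ABSTAIN


def lf_stopword(token):
    """Vote NOT_ENTITY for common function words."""
    return NOT_ENTITY if token.lower() in STOPWORDS else ABSTAIN


def lf_drug_suffix(token):
    """Vote ENTITY for tokens ending in a typical drug-name suffix."""
    return ENTITY if token.lower().endswith(("profen", "formin")) else ABSTAIN


LABELING_FUNCTIONS = [lf_in_lexicon, lf_stopword, lf_drug_suffix]


def weak_label(token):
    """Return the fraction of non-abstaining votes for ENTITY, or None
    when every labeling function abstains."""
    votes = [lf(token) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return None
    return sum(v == ENTITY for v in votes) / len(votes)


def label_sentence(sentence):
    """Attach a weak probabilistic label to each whitespace token."""
    return [(tok, weak_label(tok)) for tok in sentence.split()]
```

Calling `label_sentence` on a sentence yields per-token soft labels that could serve as training targets for a tagger. A learned generative model, as described in the abstract, would additionally estimate how accurate each source is instead of weighting all votes equally.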
