Regina Barzilay

Nutri-bullets: Summarizing Health Studies by Composing Segments

Mar 22, 2021
Darsh J Shah, Lili Yu, Tao Lei, Regina Barzilay

We introduce Nutri-bullets, a multi-document summarization task for health and nutrition. First, we present two datasets of food and health summaries from multiple scientific studies. Furthermore, we propose a novel extract-compose model to solve the problem in the regime of limited parallel data. We explicitly select key spans from several abstracts using a policy network, then compose the selected spans into a summary via a task-specific language model. Compared to state-of-the-art methods, our approach leads to more faithful, relevant and diverse summarization, properties imperative to this application. For instance, on the BreastCancer dataset our approach achieves more than a 50% improvement in relevance and faithfulness. Our code and data are available at https://github.com/darsh10/Nutribullets.

* AAAI 2021 Camera Ready
* 12 pages

Get Your Vitamin C! Robust Fact Verification with Contrastive Evidence

Mar 15, 2021
Tal Schuster, Adam Fisch, Regina Barzilay

Typical fact verification models use retrieved written evidence to verify claims. Evidence sources, however, often change over time as more information is gathered and revised. In order to adapt, models must be sensitive to subtle differences in supporting evidence. We present VitaminC, a benchmark infused with challenging cases that require fact verification models to discern and adjust to slight factual changes. We collect over 100,000 Wikipedia revisions that modify an underlying fact, and leverage these revisions, together with additional synthetically constructed ones, to create a total of over 400,000 claim-evidence pairs. Unlike previous resources, the examples in VitaminC are contrastive, i.e., they contain evidence pairs that are nearly identical in language and content, with the exception that one supports a given claim while the other does not. We show that training using this design increases robustness -- improving accuracy by 10% on adversarial fact verification and 6% on adversarial natural language inference (NLI). Moreover, the structure of VitaminC leads us to define additional tasks for fact-checking resources: tagging relevant words in the evidence for verifying the claim, identifying factual revisions, and providing automatic edits via factually consistent text generation.

* NAACL 2021

Few-shot Conformal Prediction with Auxiliary Tasks

Feb 17, 2021
Adam Fisch, Tal Schuster, Tommi Jaakkola, Regina Barzilay

We develop a novel approach to conformal prediction when the target task has limited data available for training. Conformal prediction identifies a small set of promising output candidates in place of a single prediction, with guarantees that the set contains the correct answer with high probability. When training data is limited, however, the predicted set can easily become unusably large. In this work, we obtain substantially tighter prediction sets while maintaining desirable marginal guarantees by casting conformal prediction as a meta-learning paradigm over exchangeable collections of auxiliary tasks. Our conformalization algorithm is simple, fast, and agnostic to the choice of underlying model, learning algorithm, or dataset. We demonstrate the effectiveness of this approach across a number of few-shot classification and regression tasks in natural language processing, computer vision, and computational chemistry for drug discovery.
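
To make the notion of a prediction set concrete, here is a minimal sketch of standard split conformal prediction for classification. It is not the meta-learning procedure the abstract proposes. The function names, the nonconformity score (one minus the predicted probability), and all numbers are assumptions chosen for illustration.

```python
import math

def conformal_threshold(cal_scores, epsilon):
    """Return the conformal quantile of the calibration nonconformity scores.

    With n exchangeable calibration scores, the ceil((n + 1) * (1 - epsilon))-th
    smallest score yields marginal coverage of at least 1 - epsilon.
    """
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - epsilon))
    if rank > n:
        # Too few calibration points for this epsilon: every label must be kept.
        return float("inf")
    return sorted(cal_scores)[rank - 1]

def prediction_set(label_probs, threshold):
    """Keep every label whose score (1 - probability) is within the threshold."""
    return {label for label, p in label_probs.items() if 1.0 - p <= threshold}

# Hypothetical calibration data: 1 - p(true label) on nine held-out examples.
cal_scores = [0.05, 0.10, 0.12, 0.20, 0.25, 0.30, 0.40, 0.55, 0.70]
tau = conformal_threshold(cal_scores, epsilon=0.2)

# Hypothetical model output for a new input.
probs = {"cat": 0.62, "dog": 0.30, "fox": 0.08}
print(tau, prediction_set(probs, tau))
```

The `rank > n` branch shows why limited data is a problem: with too few calibration examples the threshold becomes infinite and the set contains every label, which is the "unusably large" case the abstract describes.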

CapWAP: Captioning with a Purpose

Nov 09, 2020
Adam Fisch, Kenton Lee, Ming-Wei Chang, Jonathan H. Clark, Regina Barzilay

The traditional image captioning task uses generic reference captions to provide textual information about images. Different user populations, however, will care about different visual aspects of images. In this paper, we propose a new task, Captioning with a Purpose (CapWAP). Our goal is to develop systems that can be tailored to be useful for the information needs of an intended population, rather than merely provide generic information about an image. In this task, we use question-answer (QA) pairs---a natural expression of information need---from users, instead of reference captions, for both training and post-inference evaluation. We show that it is possible to use reinforcement learning to directly optimize for the intended information need, by rewarding outputs that allow a question answering model to provide correct answers to sampled user questions. We convert several visual question answering datasets into CapWAP datasets, and demonstrate that under a variety of scenarios our purposeful captioning system learns to anticipate and fulfill specific information needs better than its generic counterparts, as measured by QA performance on user questions from unseen images, when using the caption alone as context.

* EMNLP 2020

Modeling Drug Combinations based on Molecular Structures and Biological Targets

Nov 09, 2020
Wengong Jin, Regina Barzilay, Tommi Jaakkola

Drug combinations play an important role in therapeutics due to their better efficacy and reduced toxicity. Since validating drug combinations via direct screening is prohibitively expensive due to combinatorial explosion, recent approaches have applied machine learning to identify synergistic combinations for cancer. However, these approaches are not readily applicable to many diseases with limited combination data. Motivated by the fact that drug synergy is closely tied to biological targets, we propose a model that jointly learns drug-target interaction and drug synergy. The model, parametrized as a graph convolutional network, consists of two parts: a drug-target interaction module and a target-disease association module. These modules are trained together on drug combination screens as well as abundant drug-target interaction data. Our model is trained and evaluated on two SARS-CoV-2 drug combination screens and achieves 0.777 test AUC, which is 10% higher than the model trained without drug-target interaction.

* Accepted to NeurIPS 2020 Machine Learning for Molecules Workshop

Deciphering Undersegmented Ancient Scripts Using Phonetic Prior

Oct 21, 2020
Jiaming Luo, Frederik Hartmann, Enrico Santus, Yuan Cao, Regina Barzilay

Most undeciphered lost languages exhibit two characteristics that pose significant decipherment challenges: (1) the scripts are not fully segmented into words; (2) the closest known language is not determined. We propose a decipherment model that handles both of these challenges by building on rich linguistic constraints reflecting consistent patterns in historical sound change. We capture the natural phonological geometry by learning character embeddings based on the International Phonetic Alphabet (IPA). The resulting generative framework jointly models word segmentation and cognate alignment, informed by phonological constraints. We evaluate the model on both deciphered languages (Gothic, Ugaritic) and an undeciphered one (Iberian). The experiments show that incorporating phonetic geometry leads to clear and consistent gains. Additionally, we propose a measure for language closeness which correctly identifies related languages for Gothic and Ugaritic. For Iberian, the method does not show strong evidence supporting Basque as a related language, concurring with the favored position by the current scholarship.

* TACL 2020, pre-MIT Press publication version

Relaxed Conformal Prediction Cascades for Efficient Inference Over Many Labels

Jul 08, 2020
Adam Fisch, Tal Schuster, Tommi Jaakkola, Regina Barzilay

Providing a small set of promising candidates in place of a single prediction is well-suited for many open-ended classification tasks. Conformal Prediction (CP) is a technique for creating classifiers that produce a valid set of predictions that contains the true answer with arbitrarily high probability. In practice, however, standard CP can suffer from both low predictive and low computational efficiency during inference, i.e., the predicted set is both unusably large and costly to obtain. This is particularly pervasive in the considered setting, where the correct answer is not unique and the number of total possible answers is high. In this work, we develop two simple and complementary techniques for improving both types of efficiency. First, we relax CP validity to arbitrary criteria of success, allowing our framework to make more efficient predictions while remaining "equivalently correct." Second, we amortize cost by conformalizing prediction cascades, in which we aggressively prune implausible labels early on by using progressively stronger classifiers, while still guaranteeing marginal coverage. We demonstrate the empirical effectiveness of our approach for multiple applications in natural language processing and computational chemistry for drug discovery.

Domain Extrapolation via Regret Minimization

Jun 24, 2020
Wengong Jin, Regina Barzilay, Tommi Jaakkola

Many real prediction tasks such as molecular property prediction require the ability to extrapolate to unseen domains. Success in these tasks typically hinges on finding a good representation. In this paper, we extend invariant risk minimization (IRM) by recasting the simultaneous optimality condition in terms of regret, finding instead a representation that enables the predictor to be optimal against an oracle with hindsight access on held-out environments. The change refocuses the principle on generalization and doesn't collapse even with strong predictors that can perfectly fit all the training data. Our regret minimization (RGM) approach can be further combined with adaptive domain perturbations to handle combinatorially defined environments. We evaluate our method on two real-world applications, molecule property prediction and protein homology detection, and show that RGM significantly outperforms previous state-of-the-art domain generalization techniques.
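
The regret described above can be illustrated with a toy calculation. This sketch assumes a fixed identity representation and one-dimensional least-squares predictors. It only computes the held-out regret and is not the authors' training procedure. All names and data points are hypothetical.

```python
def fit_slope(points):
    """Least-squares slope for y ~ w * x (no intercept)."""
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return sxy / sxx

def mse(w, points):
    """Mean squared error of the predictor y = w * x on the given points."""
    return sum((y - w * x) ** 2 for x, y in points) / len(points)

def held_out_regret(envs, held_out):
    """Loss of a predictor fit on the other environments minus the loss of an
    oracle fit on the held-out environment, both measured on the held-out data."""
    others = [p for name, pts in envs.items() if name != held_out for p in pts]
    w_others = fit_slope(others)
    w_oracle = fit_slope(envs[held_out])
    return mse(w_others, envs[held_out]) - mse(w_oracle, envs[held_out])

# Hypothetical environments: env_c follows a different slope than the others.
envs = {
    "env_a": [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)],
    "env_b": [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)],
    "env_c": [(1.0, 3.0), (2.0, 6.1), (3.0, 8.8)],
}
regrets = {name: held_out_regret(envs, name) for name in envs}
for name, value in regrets.items():
    print(name, round(value, 4))
```

Because the oracle minimizes the loss on the held-out environment within the same predictor class, each regret is non-negative. A large value, as for the hypothetical `env_c`, signals that a predictor fit on the other environments transfers poorly under the current representation.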

Improved Conditional Flow Models for Molecule to Image Synthesis

Jun 15, 2020
Karren Yang, Samuel Goldman, Wengong Jin, Alex Lu, Regina Barzilay, Tommi Jaakkola, Caroline Uhler

In this paper, we aim to synthesize cell microscopy images under different molecular interventions, motivated by practical applications to drug development. Building on the recent success of graph neural networks for learning molecular embeddings and flow-based models for image generation, we propose Mol2Image: a flow-based generative model for molecule to cell image synthesis. To generate cell features at different resolutions and scale to high-resolution images, we develop a novel multi-scale flow architecture based on a Haar wavelet image pyramid. To maximize the mutual information between the generated images and the molecular interventions, we devise a training strategy based on contrastive learning. To evaluate our model, we propose a new set of metrics for biological image generation that are robust, interpretable, and relevant to practitioners. We show quantitatively that our method learns a meaningful embedding of the molecular intervention, which is translated into an image representation reflecting the biological effects of the intervention.
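
As a rough illustration of the Haar wavelet image pyramid mentioned above, the sketch below splits an image into a half-resolution average plus three detail bands and shows that the split is exactly invertible. It covers only the pyramid, not the flow model or the contrastive training. The function names and the averaging normalization are assumptions.

```python
def haar_split(img):
    """Split an image (list of rows, even height and width) into a
    half-resolution average and three detail bands, one value per 2x2 block."""
    low, bands = [], ([], [], [])
    for r in range(0, len(img), 2):
        rows = ([], [], [], [])
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 4)  # block average
            rows[1].append((a - b + d - e) / 4)  # left/right difference
            rows[2].append((a + b - d - e) / 4)  # top/bottom difference
            rows[3].append((a - b - d + e) / 4)  # diagonal difference
        low.append(rows[0])
        for band, row in zip(bands, rows[1:]):
            band.append(row)
    return low, bands

def haar_merge(low, bands):
    """Exact inverse of haar_split."""
    h_band, v_band, d_band = bands
    img = []
    for r in range(len(low)):
        top, bottom = [], []
        for c in range(len(low[0])):
            lo, h, v, dg = low[r][c], h_band[r][c], v_band[r][c], d_band[r][c]
            top += [lo + h + v + dg, lo - h + v - dg]
            bottom += [lo + h - v - dg, lo - h - v + dg]
        img += [top, bottom]
    return img

def haar_pyramid(img, levels):
    """Return the coarsest image and the detail bands, finest level first."""
    details = []
    for _ in range(levels):
        img, bands = haar_split(img)
        details.append(bands)
    return img, details

def reconstruct(coarse, details):
    """Rebuild the full-resolution image, starting from the coarsest level."""
    img = coarse
    for bands in reversed(details):
        img = haar_merge(img, bands)
    return img

image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
coarse, details = haar_pyramid(image, levels=2)
print(coarse)
print(reconstruct(coarse, details) == image)
```

Invertibility is what makes such a pyramid a plausible fit for flow-based models: a generator can produce the coarse image first and then the detail bands at each finer resolution, without losing information.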

Learning Graph Models for Template-Free Retrosynthesis

Jun 12, 2020
Vignesh Ram Somnath, Charlotte Bunne, Connor W. Coley, Andreas Krause, Regina Barzilay

Retrosynthesis prediction is a fundamental problem in organic synthesis, where the task is to identify precursor molecules that can be used to synthesize a target molecule. Despite recent advancements, neural retrosynthesis algorithms are unable to fully recapitulate the strategies employed by chemists and do not generalize well to infrequent reaction types. In this paper, we propose a graph-based approach that capitalizes on the idea that the graph topology of precursor molecules is largely unaltered during the reaction. The model first predicts the set of graph edits transforming the target into incomplete molecules called synthons. Next, the model learns to expand synthons into complete molecules by attaching relevant leaving groups. Since the model operates at the level of molecular fragments, it avoids full generation, greatly simplifying the underlying architecture and improving its ability to generalize. The model yields an 11.7% absolute improvement over state-of-the-art approaches on the USPTO-50k dataset, and a 4% absolute improvement on a rare reaction subset of the same dataset.
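
The two-stage idea (edit the target graph into synthons, then attach leaving groups) can be sketched on a toy molecular graph. Here the edit and the leaving groups are hard-coded, whereas in the described model both are predicted. The atom indexing, function names, and the methyl acetate example are assumptions for illustration, and only bond deletion is shown.

```python
def apply_edits(bonds, edits):
    """Delete the bonds named in `edits` from a molecule's bond set."""
    removed = {frozenset(e) for e in edits}
    return {b for b in bonds if b not in removed}

def fragments(atoms, bonds):
    """Connected components of the molecular graph, as sets of atom indices."""
    neighbors = {a: set() for a in atoms}
    for bond in bonds:
        u, v = tuple(bond)
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, parts = set(), []
    for start in atoms:
        if start in seen:
            continue
        stack, part = [start], set()
        while stack:
            node = stack.pop()
            if node in part:
                continue
            part.add(node)
            stack.extend(neighbors[node] - part)
        seen |= part
        parts.append(part)
    return parts

# Toy target: methyl acetate, heavy atoms only (bond orders ignored).
# 0: C (methyl), 1: C (carbonyl), 2: O (carbonyl), 3: O (ester), 4: C (methyl)
atoms = [0, 1, 2, 3, 4]
bonds = {frozenset(b) for b in [(0, 1), (1, 2), (1, 3), (3, 4)]}

# Stage 1 (hard-coded here): break the carbonyl C to ester O bond.
synthons = fragments(atoms, apply_edits(bonds, [(1, 3)]))

# Stage 2 (hard-coded here): attach a hypothetical leaving group to each synthon.
leaving_groups = ["OH", "H"]  # acyl + OH gives the acid, alkoxy + H the alcohol
reactants = list(zip(synthons, leaving_groups))
print(reactants)
```

Since most of the target graph is carried over unchanged, only the edit and the small leaving groups need to be predicted, which is the simplification the abstract attributes to operating on fragments.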
