Shrimai Prabhumoye

Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases

Dec 15, 2021

Shrimai Prabhumoye, Rafal Kocielnik, Mohammad Shoeybi, Anima Anandkumar, Bryan Catanzaro

Abstract: Detecting social bias in text is challenging due to nuance, subjectivity, and difficulty in obtaining good-quality labeled datasets at scale, especially given the evolving nature of social biases and society. To address these challenges, we propose a few-shot instruction-based method for prompting pre-trained language models (LMs). We select a few label-balanced exemplars from a small support repository that are closest to the query to be labeled in the embedding space. We then provide the LM with an instruction that consists of this subset of labeled exemplars, the query text to be classified, and a definition of bias, and prompt it to make a decision. We demonstrate that large LMs used in a few-shot context can detect different types of fine-grained biases with similar and sometimes superior accuracy to fine-tuned models. We observe that the largest 530B parameter model is significantly more effective in detecting social bias compared to smaller models (achieving at least 20% improvement in AUC metric compared to other models). It also maintains a high AUC (dropping less than 5%) in a few-shot setting with a labeled repository reduced to as few as 100 samples. Large pretrained language models thus make it easier and quicker to build new bias detectors.
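
The exemplar selection and prompt assembly described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the cosine-similarity measure, the `per_label` parameter, the ordering of the prompt parts, and the `Text:` / `Biased:` field names are all assumptions made for the example, and the embeddings are taken as precomputed vectors.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_exemplars(query_vec, repository, per_label=2):
    """Pick the `per_label` nearest exemplars for every label.

    `repository` is a list of (text, label, embedding) tuples
    (a hypothetical layout for the support repository).
    """
    by_label = defaultdict(list)
    for text, label, embedding in repository:
        by_label[label].append((cosine(query_vec, embedding), text))

    chosen = []
    for label in sorted(by_label):
        # Highest similarity first, then keep the same count per label
        # so the selected exemplars stay label-balanced.
        nearest = sorted(by_label[label], reverse=True)[:per_label]
        chosen.extend((text, label) for _, text in nearest)
    return chosen


def build_prompt(definition, exemplars, query):
    """Assemble definition, labeled exemplars, and the query into one prompt."""
    lines = [definition, ""]
    for text, label in exemplars:
        lines.extend([f"Text: {text}", f"Biased: {label}", ""])
    lines.extend([f"Text: {query}", "Biased:"])
    return "\n".join(lines)
```

A caller would embed the query with the same encoder used for the repository, pass the result to `select_exemplars`, and send the string from `build_prompt` to the language model for completion.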

Focused Attention Improves Document-Grounded Generation

Apr 26, 2021

Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov

Abstract: Document-grounded generation is the task of using the information provided in a document to improve text generation. This work focuses on two different document-grounded generation tasks: Wikipedia update generation and dialogue response generation. Our work introduces two novel adaptations of large-scale pre-trained encoder-decoder models, focusing on building a context-driven representation of the document and enabling specific attention to the information in the document. Additionally, we provide a stronger BART baseline for these tasks. Our proposed techniques outperform existing methods on both automated (at least 48% increase in BLEU-4 points) and human evaluation for closeness to reference and relevance to the document. Furthermore, we perform comprehensive manual inspection of the generated output and categorize errors to provide insights into future directions in modeling these tasks.

* Accepted at North American Chapter of the Association for Computational Linguistics (NAACL) 2021

CURIE: An Iterative Querying Approach for Reasoning About Situations

Apr 05, 2021

Dheeraj Rajagopal, Aman Madaan, Niket Tandon, Yiming Yang, Shrimai Prabhumoye, Abhilasha Ravichander, Peter Clark, Eduard Hovy

Abstract: Recently, models have been shown to predict the effects of unexpected situations, e.g., would cloudy skies help or hinder plant growth? Given a context, the goal of such situational reasoning is to elicit the consequences of a new situation (st) that arises in that context. We propose a method to iteratively build a graph of relevant consequences explicitly in a structured situational graph (st-graph) using natural language queries over a finetuned language model (M). Across multiple domains, CURIE generates st-graphs that humans find relevant and meaningful in eliciting the consequences of a new situation. We show that st-graphs generated by CURIE improve a situational reasoning end task (WIQA-QA) by 3 points on accuracy by simply augmenting their input with our generated situational graphs, especially for a hard subset that requires background knowledge and multi-hop reasoning.

* This paper builds upon EIGEN (arXiv:2010.11764) and proposes a general framework for situational reasoning

EIGEN: Event Influence GENeration using Pre-trained Language Models

Oct 22, 2020

Aman Madaan, Dheeraj Rajagopal, Yiming Yang, Abhilasha Ravichander, Eduard Hovy, Shrimai Prabhumoye

Abstract: Reasoning about events and tracking their influences is fundamental to understanding processes. In this paper, we present EIGEN - a method to leverage pre-trained language models to generate event influences conditioned on a context, nature of their influence, and the distance in a reasoning chain. We also derive a new dataset for research and evaluation of methods for event influence generation. EIGEN outperforms strong baselines both in terms of automated evaluation metrics (by 10 ROUGE points) and human judgments on closeness to reference and relevance of generations. Furthermore, we show that the event influences generated by EIGEN improve the performance on a "what-if" Question Answering (WIQA) benchmark (over 3% F1), especially for questions that require background knowledge and multi-hop reasoning.

Case Study: Deontological Ethics in NLP

Oct 09, 2020

Shrimai Prabhumoye, Brendon Boldt, Ruslan Salakhutdinov, Alan W Black

Abstract: Recent work in natural language processing (NLP) has focused on ethical challenges such as understanding and mitigating bias in data and algorithms; identifying objectionable content like hate speech, stereotypes and offensive language; and building frameworks for better system design and data handling practices. However, there has been little discussion about the ethical foundations that underlie these efforts. In this work, we study one ethical theory, namely deontological ethics, from the perspective of NLP. In particular, we focus on the generalization principle and the respect for autonomy through informed consent. We provide four case studies to demonstrate how these principles can be used with NLP systems. We also recommend directions to avoid the ethical issues in these systems.

Exploring Controllable Text Generation Techniques

May 04, 2020

Shrimai Prabhumoye, Alan W Black, Ruslan Salakhutdinov

Abstract: Neural controllable text generation is an important area gaining attention due to its plethora of applications. In this work, we provide a new schema of the pipeline of the generation process by classifying it into five modules. We present an overview of the various techniques used to modulate each of these five modules to provide control of attributes in the generation process. We also provide an analysis of the advantages and disadvantages of these techniques and open paths to develop new architectures based on the combination of the modules described in this paper.

Politeness Transfer: A Tag and Generate Approach

May 01, 2020

Aman Madaan, Amrith Setlur, Tanmay Parekh, Barnabas Poczos, Graham Neubig, Yiming Yang, Ruslan Salakhutdinov, Alan W Black, Shrimai Prabhumoye

Abstract: This paper introduces a new task of politeness transfer which involves converting non-polite sentences to polite sentences while preserving the meaning. We also provide a dataset of more than 1.39 million instances automatically labeled for politeness to encourage benchmark evaluations on this new task. We design a tag and generate pipeline that identifies stylistic attributes and subsequently generates a sentence in the target style while preserving most of the source content. For politeness as well as five other transfer tasks, our model outperforms the state-of-the-art methods on automatic metrics for content preservation, with a comparable or better performance on style transfer accuracy. Additionally, our model surpasses existing methods on human evaluations for grammaticality, meaning preservation and transfer accuracy across all the six style transfer tasks. The data and code are located at https://github.com/tag-and-generate.

* To appear at ACL 2020

Topological Sort for Sentence Ordering

May 01, 2020

Shrimai Prabhumoye, Ruslan Salakhutdinov, Alan W Black

Abstract: Sentence ordering is the task of arranging the sentences of a given text in the correct order. Recent work using deep neural networks for this task has framed it as a sequence prediction problem. In this paper, we propose a new framing of this task as a constraint solving problem and introduce a new technique to solve it. Additionally, we propose a human evaluation for this task. The results on both automatic and human metrics across four different datasets show that this new technique is better at capturing coherence in documents.

* Will be published at the Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL) 2020
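
To make the constraint-solving framing of the sentence ordering paper concrete, here is a minimal sketch of a topological sort over pairwise ordering constraints (Kahn's algorithm). The abstract does not say how the constraints are obtained, so the example simply takes them as input; the function name `order_sentences` and the `(before, after)` pair format are hypothetical.

```python
from collections import defaultdict, deque


def order_sentences(n, constraints):
    """Order sentences 0..n-1 so every (before, after) constraint holds.

    `constraints` is an iterable of index pairs meaning "sentence
    `before` must precede sentence `after`". Raises ValueError when
    the constraints are cyclic and no valid order exists.
    """
    successors = defaultdict(list)
    indegree = [0] * n
    for before, after in constraints:
        successors[before].append(after)
        indegree[after] += 1

    # Start from sentences that nothing is required to precede.
    ready = deque(i for i in range(n) if indegree[i] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != n:
        raise ValueError("constraints contain a cycle")
    return order
```

For example, `order_sentences(3, [(2, 0), (0, 1), (2, 1)])` returns `[2, 0, 1]`. Constraints predicted by a model may contradict each other, which is why the sketch reports cycles instead of silently returning a partial order.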

I love your chain mail! Making knights smile in a fantasy game world: Open-domain goal-oriented dialogue agents

Feb 10, 2020

Shrimai Prabhumoye, Margaret Li, Jack Urbanek, Emily Dinan, Douwe Kiela, Jason Weston, Arthur Szlam

Abstract: Dialogue research tends to distinguish between chit-chat and goal-oriented tasks. While the former is arguably more naturalistic and has a wider use of language, the latter has clearer metrics and a straightforward learning signal. Humans effortlessly combine the two, for example engaging in chit-chat with the goal of exchanging information or eliciting a specific response. Here, we bridge the divide between these two domains in the setting of a rich multi-player text-based fantasy environment where agents and humans engage in both actions and dialogue. Specifically, we train a goal-oriented model with reinforcement learning against an imitation-learned "chit-chat" model with two approaches: the policy either learns to pick a topic or learns to pick an utterance given the top-K utterances from the chit-chat model. We show that both models outperform an inverse model baseline and can converse naturally with their dialogue partner in order to achieve goals.

Modeling Product Search Relevance in e-Commerce

Jan 14, 2020

Rahul Radhakrishnan Iyer, Rohan Kohli, Shrimai Prabhumoye

Abstract: With the rapid growth of e-Commerce, online product search has emerged as a popular and effective paradigm for customers to find desired products and engage in online shopping. However, there is still a big gap between the products that customers really desire to purchase and the relevance of products that are suggested in response to a query from the customer. In this paper, we propose a robust way of predicting relevance scores given a search query and a product, using techniques involving machine learning, natural language processing and information retrieval. We compare conventional information retrieval models such as BM25 and Indri with deep learning models such as word2vec, sentence2vec and paragraph2vec. We share some of our insights and findings from our experiments.

* 11 pages, 3 figures
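
As an illustration of the BM25 baseline named in the product search abstract, the sketch below scores product descriptions against a query with the standard Okapi BM25 formula. It is not the paper's setup: whitespace tokenization, the smoothed IDF variant `log((N - n + 0.5) / (n + 0.5) + 1)`, and the defaults `k1=1.5` and `b=0.75` are assumptions chosen for the example, and `documents` is assumed to be non-empty.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one BM25 relevance score per document for the query."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            n_t = doc_freq[term]
            idf = math.log((n_docs - n_t + 0.5) / (n_t + 0.5) + 1.0)
            # Length normalization penalizes longer-than-average documents.
            norm = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Ranking products then amounts to sorting them by descending score; documents that share no term with the query receive a score of zero.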
