Debanjan Ghosh

The Benefits of Label-Description Training for Zero-Shot Text Classification

May 03, 2023
Lingyu Gao, Debanjan Ghosh, Kevin Gimpel

Large language models have improved zero-shot text classification by allowing the transfer of semantic knowledge from the training data in order to classify among specific label sets in downstream tasks. We propose a simple way to further improve zero-shot accuracies with minimal effort. We curate small finetuning datasets intended to describe the labels for a task. Unlike typical finetuning data, which has texts annotated with labels, our data simply describes the labels in language, e.g., using a few related terms, dictionary/encyclopedia entries, and short templates. Across a range of topic and sentiment datasets, our method is more accurate than zero-shot by 15-17% absolute. It is also more robust to choices required for zero-shot classification, such as patterns for prompting the model to classify and mappings from labels to tokens in the model's vocabulary. Furthermore, since our data merely describes the labels but does not use input texts, finetuning on it yields a model that performs strongly on multiple text domains for a given label set, even improving over few-shot out-of-domain classification in multiple settings.
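The core idea of label-description training can be sketched as a small data-construction step: instead of annotated input texts, the finetuning examples pair prompt patterns with texts that describe each label. The labels, related terms, definitions, and patterns below are illustrative placeholders, not the paper's actual resources:

```python
# Minimal sketch of building a label-description finetuning set.
# All labels, terms, and templates here are hypothetical examples.
LABEL_DESCRIPTIONS = {
    "sports": {
        "related_terms": ["athlete", "tournament", "score"],
        "definition": "activities involving physical exertion and skill",
    },
    "business": {
        "related_terms": ["market", "company", "profit"],
        "definition": "commercial activity such as trade and finance",
    },
}

PATTERNS = [
    "{text} This text is about {label}.",
    "[Topic: {label}] {text}",
]

def build_label_description_data(descriptions, patterns):
    """Pair each label's describing texts with each prompt pattern.

    Unlike standard finetuning data, the 'text' slot holds a description
    of the label itself (a related term or a dictionary-style entry),
    not an annotated input document.
    """
    examples = []
    for label, desc in descriptions.items():
        describing_texts = desc["related_terms"] + [desc["definition"]]
        for text in describing_texts:
            for pattern in patterns:
                examples.append(
                    {"input": pattern.format(text=text, label=label),
                     "label": label}
                )
    return examples

data = build_label_description_data(LABEL_DESCRIPTIONS, PATTERNS)
```

Because the examples never mention an input domain, a model finetuned on such data is tied only to the label set, which is consistent with the cross-domain robustness the abstract reports.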


Controlled Language Generation for Language Learning Items

Nov 28, 2022
Kevin Stowe, Debanjan Ghosh, Mengxuan Zhao

This work aims to employ natural language generation (NLG) to rapidly generate items for English language learning applications: this requires both language models capable of generating fluent, high-quality English and control over the generated output so that it matches the requirements of the relevant items. We experiment with deep pretrained models for this task, developing novel methods for controlling items for factors relevant in language learning: diverse sentences for different proficiency levels and argument structure to test grammar. Human evaluation demonstrates high grammaticality scores for all models (3.4 and above out of 4), and higher length (24%) and complexity (9%) over the baseline for the advanced proficiency model. Our results show that we can achieve strong performance while adding control to ensure diverse, tailored content for individual users.

* 9 pages, 3 figures. Accepted to Industry Track at EMNLP 2022 

AGReE: A system for generating Automated Grammar Reading Exercises

Nov 03, 2022
Sophia Chan, Swapna Somasundaran, Debanjan Ghosh, Mengxuan Zhao

We describe the AGReE system, which takes user-submitted passages as input and automatically generates grammar practice exercises that can be completed while reading. Multiple-choice practice items are generated for a variety of grammar constructs: punctuation, articles, conjunctions, pronouns, prepositions, verbs, and nouns. We also conducted a large-scale human evaluation with around 4,500 multiple-choice practice items. We find that for 95% of items, a majority of the five raters identified the correct answer, and for 85% of items, raters agreed that there is only one correct answer among the choices. Finally, our error analysis shows that raters made the most mistakes on punctuation and conjunction items.

* Accepted to EMNLP 2022 Demonstration Track 

FLUTE: Figurative Language Understanding and Textual Explanations

May 24, 2022
Tuhin Chakrabarty, Arkadiy Saakyan, Debanjan Ghosh, Smaranda Muresan

In spite of the prevalence of figurative language, transformer-based models struggle to demonstrate an understanding of it. Meanwhile, even classical natural language inference (NLI) tasks have been plagued by spurious correlations and annotation artifacts. Datasets like eSNLI have been released, making it possible to probe whether language models are right for the right reasons. Yet no such data exists for figurative language, making it harder to assess genuine understanding of such expressions. In light of the above, we release FLUTE, a dataset of 8,000 figurative NLI instances with explanations, spanning three categories: Sarcasm, Simile, and Metaphor. We collect the data through a Human-AI collaboration framework based on GPT-3, crowdworkers, and expert annotation. We show how utilizing GPT-3 in conjunction with human experts can aid in scaling up the creation of datasets even for such complex linguistic phenomena as figurative language. Baseline performance of the T5 model shows that our dataset is a challenging testbed for figurative language understanding.

* Work in progress 

"What makes a question inquisitive?" A Study on Type-Controlled Inquisitive Question Generation

May 19, 2022
Lingyu Gao, Debanjan Ghosh, Kevin Gimpel

We propose a type-controlled framework for inquisitive question generation. We annotate an inquisitive question dataset with question types, train question type classifiers, and finetune models for type-controlled question generation. Empirical results demonstrate that we can generate a variety of questions that adhere to specific types while drawing from the source texts. We also investigate strategies for selecting a single question from a generated set, considering both an informative vs. inquisitive question classifier and a pairwise ranker trained from a small set of expert annotations. Question selection using the pairwise ranker yields strong results in automatic and manual evaluation. Our human evaluation assesses multiple aspects of the generated questions, finding that the ranker chooses questions with the best syntax (4.59), semantics (4.37), and inquisitiveness (3.92) on a scale of 1-5, even rivaling the performance of human-written questions.
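One way to realize the pairwise-ranker selection strategy is a simple round-robin tournament over the generated set. The `prefer` callable below stands in for the trained pairwise ranker and is purely hypothetical; the paper's actual ranker and interface may differ:

```python
def select_best(questions, prefer):
    """Pick one question from a generated set via pairwise comparisons.

    `prefer(a, b)` is a stand-in for a trained pairwise ranker: it
    returns True when question `a` is ranked above question `b`.
    Each question scores one point per pairwise win, and the top
    scorer is selected.
    """
    wins = {q: 0 for q in questions}
    for i, a in enumerate(questions):
        for b in questions[i + 1:]:
            winner = a if prefer(a, b) else b
            wins[winner] += 1
    return max(questions, key=lambda q: wins[q])
```

For example, plugging in a toy preference such as `lambda a, b: len(a) > len(b)` selects the longest question; a learned ranker would replace this with model scores.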

* Accepted at the 11th Joint Conference on Lexical and Computational Semantics (*SEM) Conference, NAACL 2022 

Figurative Language in Recognizing Textual Entailment

Jun 03, 2021
Tuhin Chakrabarty, Debanjan Ghosh, Adam Poliak, Smaranda Muresan

We introduce a collection of recognizing textual entailment (RTE) datasets focused on figurative language. We leverage five existing datasets annotated for a variety of figurative language (simile, metaphor, and irony) and frame them into over 12,500 RTE examples. We evaluate how well state-of-the-art models trained on popular RTE datasets capture different aspects of figurative language. Our results and analyses indicate that these models might not sufficiently capture figurative language, struggling to perform pragmatic inference and reasoning about world knowledge. Ultimately, our datasets provide a challenging testbed for evaluating RTE models.
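Framing a figurative-language pair as an RTE example amounts to mapping it onto premise/hypothesis/label fields. A minimal sketch with hypothetical field and label names (the released datasets may use a different schema):

```python
def figurative_to_rte(sentence, figurative_rewrite, preserves_meaning):
    """Frame a literal/figurative sentence pair as a textual-entailment
    example: the literal sentence becomes the premise and the figurative
    rewrite the hypothesis, labeled entailment when the rewrite preserves
    the meaning (e.g., a metaphor restating the premise) and
    not_entailment when it does not (e.g., irony inverting it).
    """
    return {
        "premise": sentence,
        "hypothesis": figurative_rewrite,
        "label": "entailment" if preserves_meaning else "not_entailment",
    }

example = figurative_to_rte(
    "The exam was very easy.",
    "The exam was a walk in the park.",
    preserves_meaning=True,
)
```

A model that relies on lexical overlap alone would struggle here, since "walk in the park" shares almost no content words with the premise it entails.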

* ACL 2021 (Findings) 

"Sharks are not the threat humans are": Argument Component Segmentation in School Student Essays

Mar 08, 2021
Tariq Alhindi, Debanjan Ghosh

Argument mining is often addressed by a pipeline method in which text is first segmented into argumentative units, followed by an argument component identification task. In this research, we apply token-level classification to identify claim and premise tokens in a new corpus of argumentative essays written by middle school students. To this end, we compare a variety of state-of-the-art approaches, ranging from discrete features to deep learning architectures (e.g., BiLSTM networks and BERT-based architectures), to identify the argument components. We demonstrate that a BERT-based multi-task learning architecture (i.e., joint token- and sentence-level classification) adaptively pretrained on a relevant unlabeled dataset obtains the best results.
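Token-level claim/premise identification is commonly cast as BIO tagging. A minimal sketch of converting component spans into per-token labels; the span indices and type names below are hypothetical, not drawn from the corpus:

```python
def bio_tags(tokens, spans):
    """Convert (start, end, type) argument-component spans into
    per-token BIO tags for claim/premise segmentation.

    `end` is exclusive; tokens outside any span are tagged 'O'.
    """
    tags = ["O"] * len(tokens)
    for start, end, comp_type in spans:
        tags[start] = f"B-{comp_type}"          # component begins here
        for i in range(start + 1, end):
            tags[i] = f"I-{comp_type}"          # continuation tokens
    return tags

tokens = "Sharks are not the threat humans are".split()
tags = bio_tags(tokens, [(0, 7, "CLAIM")])      # whole sentence as a claim
```

A token classifier (BiLSTM or BERT-based) is then trained to predict these tags, with the sentence-level task sharing the encoder in the multi-task setup.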

* Accepted to the 16th Workshop on Innovative Use of NLP for Building Educational Applications. Co-located with EACL 2021 

"Laughing at you or with you": The Role of Sarcasm in Shaping the Disagreement Space

Jan 26, 2021
Debanjan Ghosh, Ritvik Shrivastava, Smaranda Muresan

Detecting arguments in online interactions is useful for understanding how conflicts arise and get resolved. Participants often use figurative language, such as sarcasm, either as a persuasive device or to attack the opponent with an ad hominem argument. To further our understanding of the role of sarcasm in shaping the disagreement space, we present a thorough experimental setup using a corpus annotated with both argumentative moves (agree/disagree) and sarcasm. We exploit joint modeling in terms of (a) applying discrete features that are useful for detecting sarcasm to the task of argumentative relation classification (agree/disagree/none), and (b) multitask learning for argumentative relation classification and sarcasm detection using deep learning architectures (e.g., dual Long Short-Term Memory (LSTM) networks with hierarchical attention and Transformer-based architectures). We demonstrate that modeling sarcasm improves argumentative relation classification (agree/disagree/none) in all setups.

* Accepted in the 16th conference of the European Chapter of the Association for Computational Linguistics (EACL). Long paper 

An Exploratory Study of Argumentative Writing by Young Students: A Transformer-based Approach

Jun 17, 2020
Debanjan Ghosh, Beata Beigman Klebanov, Yi Song

We present a computational exploration of argument critique writing by young students. Middle school students were asked to critique an argument presented in the prompt, focusing on identifying and explaining the reasoning flaws. This task resembles an established college-level argument critique task. Lexical and discourse features that utilize detailed domain knowledge to identify critiques exist for the college task but do not perform well on the young students' data. Instead, a transformer-based architecture (e.g., BERT) fine-tuned on a large corpus of critique essays from the college task performs much better (over 20% improvement in F1 score). Analysis of the performance of various system configurations suggests that while children's writing does not exhibit the standard discourse structure of an argumentative essay, it does share basic local sequential structures with that of more mature writers.

* 15th Workshop on Innovative Use of NLP for Building Educational Applications, ACL 2020 

A Report on the 2020 Sarcasm Detection Shared Task

Jun 04, 2020
Debanjan Ghosh, Avijit Vajpayee, Smaranda Muresan

Detecting sarcasm and verbal irony is critical for understanding people's actual sentiments and beliefs. Thus, sarcasm analysis has become a popular research problem in natural language processing. As the community working on computational approaches for sarcasm detection grows, it is imperative to conduct benchmarking studies that analyze the current state of the art, facilitating progress in this area. We report on the shared task on sarcasm detection we conducted as part of the 2nd Workshop on Figurative Language Processing (FigLang 2020) at ACL 2020.

* 2nd Workshop on Figurative Language Processing (FigLang2020) at ACL 2020 