Smaranda Muresan

Large Language Models are Few-Shot Training Example Generators: A Case Study in Fallacy Recognition

Nov 16, 2023

Tariq Alhindi, Smaranda Muresan, Preslav Nakov

Abstract:Recognizing fallacies is crucial for ensuring the quality and validity of arguments across various domains. However, computational fallacy recognition faces challenges due to the diverse genres, domains, and types of fallacies found in datasets. This leads to a highly multiclass, and even multi-label, setup with substantial class imbalance. In this study, we aim to enhance existing models for fallacy recognition by incorporating additional context and by leveraging large language models to generate synthetic data, thus increasing the representation of the infrequent classes. We experiment with GPT3.5 to generate synthetic examples and we examine the impact of prompt settings for this. Moreover, we explore zero-shot and few-shot scenarios to evaluate the effectiveness of using the generated examples for training smaller models within a unified fallacy recognition framework. Furthermore, we analyze the overlap between the synthetic data and existing fallacy datasets. Finally, we investigate the usefulness of providing supplementary context for detecting fallacy types that need such context, e.g., diversion fallacies. Our evaluation results demonstrate consistent improvements across fallacy types, datasets, and generators.

Identifying Self-Disclosures of Use, Misuse and Addiction in Community-based Social Media Posts

Nov 15, 2023

Chenghao Yang, Tuhin Chakrabarty, Karli R Hochstatter, Melissa N Slavin, Nabila El-Bassel, Smaranda Muresan

Abstract:In the last decade, the United States has lost more than 500,000 people from an overdose involving prescription and illicit opioids (https://www.cdc.gov/drugoverdose/epidemic/index.html), making it a national public health emergency (USDHHS, 2017). To more effectively prevent unintentional opioid overdoses, medical practitioners require robust and timely tools that can effectively identify at-risk patients. Community-based social media platforms such as Reddit allow self-disclosure for users to discuss otherwise sensitive drug-related behaviors, often acting as indicators for opioid use disorder. Towards this, we present a moderate-size corpus of 2500 opioid-related posts from various subreddits spanning 6 different phases of opioid use: Medical Use, Misuse, Addiction, Recovery, Relapse, Not Using. For every post, we annotate span-level extractive explanations and crucially study their role both in annotation quality and model development. We evaluate several state-of-the-art models in a supervised, few-shot, or zero-shot setting. Experimental results and error analysis show that identifying the phases of opioid use disorder is highly contextual and challenging. However, we find that using explanations during modeling leads to a significant boost in classification accuracy, demonstrating their beneficial role in a high-stakes domain such as studying the opioid use disorder continuum. The dataset will be made available for research on GitHub in the formal version.

* Work in progress

Learning to Follow Object-Centric Image Editing Instructions Faithfully

Oct 29, 2023

Tuhin Chakrabarty, Kanishk Singh, Arkadiy Saakyan, Smaranda Muresan

Abstract:Natural language instructions are a powerful interface for editing the outputs of text-to-image diffusion models. However, several challenges need to be addressed: 1) underspecification (the need to model the implicit meaning of instructions), 2) grounding (the need to localize where the edit has to be performed), 3) faithfulness (the need to preserve the elements of the image not affected by the edit instruction). Current approaches focusing on image editing with natural language instructions rely on automatically generated paired data, which, as shown in our investigation, is noisy and sometimes nonsensical, exacerbating the above issues. Building on recent advances in segmentation, Chain-of-Thought prompting, and visual question answering, we significantly improve the quality of the paired data. In addition, we enhance the supervision signal by highlighting parts of the image that need to be changed by the instruction. The model fine-tuned on the improved data is capable of performing fine-grained object-centric edits better than state-of-the-art baselines, mitigating the problems outlined above, as shown by automatic and human evaluations. Moreover, our model is capable of generalizing to domains unseen during training, such as visual metaphors.

* Findings of EMNLP 2023 (Long paper)

NormDial: A Comparable Bilingual Synthetic Dialog Dataset for Modeling Social Norm Adherence and Violation

Oct 25, 2023

Oliver Li, Mallika Subramanian, Arkadiy Saakyan, Sky CH-Wang, Smaranda Muresan

Abstract:Social norms fundamentally shape interpersonal communication. We present NormDial, a high-quality dyadic dialogue dataset with turn-by-turn annotations of social norm adherences and violations for Chinese and American cultures. Introducing the task of social norm observance detection, our dataset is synthetically generated in both Chinese and English using a human-in-the-loop pipeline by prompting large language models with a small collection of expert-annotated social norms. We show that our generated dialogues are of high quality through human evaluation and further evaluate the performance of existing large language models on this task. Our findings point towards new directions for understanding the nuances of social norms as they manifest in conversational contexts that span across languages and cultures.

* EMNLP 2023 Main Conference, Short Paper; Data at https://github.com/Aochong-Li/NormDial

Art or Artifice? Large Language Models and the False Promise of Creativity

Sep 25, 2023

Tuhin Chakrabarty, Philippe Laban, Divyansh Agarwal, Smaranda Muresan, Chien-Sheng Wu

Abstract:Researchers have argued that large language models (LLMs) exhibit high-quality writing capabilities from blogs to stories. However, objectively evaluating the creativity of a piece of writing is challenging. Inspired by the Torrance Test of Creative Thinking (TTCT), which measures creativity as a process, we use the Consensual Assessment Technique [3] and propose the Torrance Test of Creative Writing (TTCW) to evaluate creativity as a product. TTCW consists of 14 binary tests organized into the original dimensions of Fluency, Flexibility, Originality, and Elaboration. We recruit 10 creative writers and implement a human assessment of 48 stories written either by professional authors or LLMs using TTCW. Our analysis shows that LLM-generated stories pass 3-10X fewer TTCW tests than stories written by professionals. In addition, we explore the use of LLMs as assessors to automate the TTCW evaluation, revealing that none of the LLMs positively correlate with the expert assessments.

Creativity Support in the Age of Large Language Models: An Empirical Study Involving Emerging Writers

Sep 25, 2023

Tuhin Chakrabarty, Vishakh Padmakumar, Faeze Brahman, Smaranda Muresan

Abstract:The development of large language models (LLMs) capable of following instructions and engaging in conversational interactions has sparked increased interest in their utilization across various support tools. We investigate the utility of modern LLMs in assisting professional writers via an empirical user study (n=30). The design of our collaborative writing interface is grounded in the cognitive process model of writing that views writing as a goal-oriented thinking process encompassing non-linear cognitive activities: planning, translating, and reviewing. Participants are asked to submit a post-completion survey to provide feedback on the potential and pitfalls of LLMs as writing collaborators. Upon analyzing the writer-LLM interactions, we find that while writers seek LLMs' help across all three types of cognitive activities, they find LLMs more helpful in translation and reviewing. Our findings from analyzing both the interactions and the survey responses highlight future research directions in creative writing assistance using LLMs.

ICLEF: In-Context Learning with Expert Feedback for Explainable Style Transfer

Sep 15, 2023

Arkadiy Saakyan, Smaranda Muresan

Abstract:While state-of-the-art language models excel at the style transfer task, current work does not address explainability of style transfer systems. Explanations could be generated using large language models such as GPT-3.5 and GPT-4, but the use of such complex systems is inefficient when smaller, widely distributed, and transparent alternatives are available. We propose a framework to augment and improve a formality style transfer dataset with explanations via model distillation from ChatGPT. To further refine the generated explanations, we propose a novel way to incorporate scarce expert human feedback using in-context learning (ICLEF: In-Context Learning from Expert Feedback) by prompting ChatGPT to act as a critic to its own outputs. We use the resulting dataset of 9,960 explainable formality style transfer instances (e-GYAFC) to show that current openly distributed instruction-tuned models (and, in some settings, ChatGPT) perform poorly on the task, and that fine-tuning on our high-quality dataset leads to significant improvements as shown by automatic evaluation. In human evaluation, we show that models much smaller than ChatGPT fine-tuned on our data align better with expert preferences. Finally, we discuss two potential applications of models fine-tuned on the explainable style transfer task: interpretable authorship verification and interpretable adversarial attacks on AI-generated text detectors.

I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors

May 24, 2023

Tuhin Chakrabarty, Arkadiy Saakyan, Olivia Winn, Artemis Panagopoulou, Yue Yang, Marianna Apidianaki, Smaranda Muresan

Abstract:Visual metaphors are powerful rhetorical devices used to persuade or communicate creative ideas through images. Similar to linguistic metaphors, they convey meaning implicitly through symbolism and juxtaposition of the symbols. We propose a new task of generating visual metaphors from linguistic metaphors. This is a challenging task for diffusion-based text-to-image models, such as DALL·E 2, since it requires the ability to model implicit meaning and compositionality. We propose to solve the task through the collaboration between Large Language Models (LLMs) and Diffusion Models: Instruct GPT-3 (davinci-002) with Chain-of-Thought prompting generates text that represents a visual elaboration of the linguistic metaphor containing the implicit meaning and relevant objects, which is then used as input to the diffusion-based text-to-image models. Using a human-AI collaboration framework, where humans interact both with the LLM and the top-performing diffusion model, we create a high-quality dataset containing 6,476 visual metaphors for 1,540 linguistic metaphors and their associated visual elaborations. Evaluation by professional illustrators shows the promise of LLM-Diffusion Model collaboration for this task. To evaluate the utility of our Human-AI collaboration framework and the quality of our dataset, we perform both an intrinsic human-based evaluation and an extrinsic evaluation using visual entailment as a downstream task.

* ACL 2023 (Findings)

Sociocultural Norm Similarities and Differences via Situational Alignment and Explainable Textual Entailment

May 23, 2023

Sky CH-Wang, Arkadiy Saakyan, Oliver Li, Zhou Yu, Smaranda Muresan

Abstract:Designing systems that can reason across cultures requires that they are grounded in the norms of the contexts in which they operate. However, current research on developing computational models of social norms has primarily focused on American society. Here, we propose a novel approach to discover and compare descriptive social norms across Chinese and American cultures. We demonstrate our approach by leveraging discussions on a Chinese Q&A platform, Zhihu, and the existing SocialChemistry dataset as proxies for contrasting cultural axes, align social situations cross-culturally, and extract social norms from texts using in-context learning. Embedding Chain-of-Thought prompting in a human-AI collaborative framework, we build a high-quality dataset of 3,069 social norms aligned with social situations across Chinese and American cultures alongside corresponding free-text explanations. To test the ability of models to reason about social norms across cultures, we introduce the task of explainable social norm entailment, showing that existing models under 3B parameters have significant room for improvement in both automatic and human evaluation. Further analysis of cross-cultural norm differences based on our dataset shows empirical alignment with the social orientations framework, revealing several situational and descriptive nuances in norms across these cultures.

A Weak Supervision Approach for Few-Shot Aspect Based Sentiment

May 19, 2023

Robert Vacareanu, Siddharth Varia, Kishaloy Halder, Shuai Wang, Giovanni Paolini, Neha Anna John, Miguel Ballesteros, Smaranda Muresan

Abstract:We explore how weak supervision on abundant unlabeled data can be leveraged to improve few-shot performance in aspect-based sentiment analysis (ABSA) tasks. We propose a pipeline approach to construct a noisy ABSA dataset, and we use it to adapt a pre-trained sequence-to-sequence model to the ABSA tasks. We test the resulting model on three widely used ABSA datasets, before and after fine-tuning. Our proposed method preserves the full fine-tuning performance while showing significant improvements (15.84% absolute F1) in the few-shot learning scenario for the harder tasks. In zero-shot (i.e., without fine-tuning), our method outperforms the previous state of the art on the aspect extraction sentiment classification (AESC) task and is, additionally, capable of performing the harder aspect sentiment triplet extraction (ASTE) task.
