Eduardo Blanco

Identifying and Answering Questions with False Assumptions: An Interpretable Approach

Aug 21, 2025

Zijie Wang, Eduardo Blanco

Abstract: People often ask questions with false assumptions, a type of question that does not have regular answers. Answering such questions requires first identifying the false assumptions. Large Language Models (LLMs) often generate misleading answers because of hallucinations. In this paper, we focus on identifying and answering questions with false assumptions in several domains. We first investigate whether the problem can be reduced to fact verification. Then, we present an approach leveraging external evidence to mitigate hallucinations. Experiments with five LLMs demonstrate that (1) incorporating retrieved evidence is beneficial and (2) generating and validating atomic assumptions yields more improvements and provides an interpretable answer by specifying the false assumptions.

* To appear at EMNLP 2025 Main conference

Fane at SemEval-2025 Task 10: Zero-Shot Entity Framing with Large Language Models

Apr 29, 2025

Enfa Fane, Mihai Surdeanu, Eduardo Blanco, Steven R. Corman

Abstract: Understanding how news narratives frame entities is crucial for studying media's impact on societal perceptions of events. In this paper, we evaluate the zero-shot capabilities of large language models (LLMs) in classifying framing roles. Through systematic experimentation, we assess the effects of input context, prompting strategies, and task decomposition. Our findings show that a hierarchical approach of first identifying broad roles and then fine-grained roles outperforms single-step classification. We also demonstrate that optimal input contexts and prompts vary across task levels, highlighting the need for subtask-specific strategies. We achieve a Main Role Accuracy of 89.4% and an Exact Match Ratio of 34.5%, demonstrating the effectiveness of our approach. Our findings emphasize the importance of tailored prompt design and input context optimization for improving LLM performance in entity framing.

* Accepted to the 19th International Workshop on Semantic Evaluation (SemEval 2025)

Making Language Models Robust Against Negation

Feb 11, 2025

MohammadHossein Rezaei, Eduardo Blanco

Abstract: Negation has been a long-standing challenge for language models. Previous studies have shown that they struggle with negation in many natural language understanding tasks. In this work, we propose a self-supervised method to make language models more robust against negation. We introduce a novel task, Next Sentence Polarity Prediction (NSPP), and a variation of the Next Sentence Prediction (NSP) task. We show that BERT and RoBERTa further pre-trained on our tasks outperform the off-the-shelf versions on nine negation-related benchmarks. Most notably, our pre-training tasks yield between 1.8% and 9.1% improvement on CondaQA, a large question-answering corpus requiring reasoning over negation.

* Accepted to NAACL 2025

Echoes of Discord: Forecasting Hater Reactions to Counterspeech

Jan 27, 2025

Xiaoying Song, Sharon Lisseth Perez, Xinchen Yu, Eduardo Blanco, Lingzi Hong

Abstract: Hate speech (HS) erodes the inclusiveness of online users and propagates negativity and division. Counterspeech has been recognized as a way to mitigate the harmful consequences. While some research has investigated the impact of user-generated counterspeech on social media platforms, few studies have examined and modeled haters' reactions toward counterspeech, despite the immediate alteration of haters' attitudes being an important aspect of counterspeech. This study fills the gap by analyzing the impact of counterspeech from the hater's perspective, focusing on whether the counterspeech leads the hater to reenter the conversation and whether the reentry is hateful. We compile the Reddit Echoes of Hate dataset (ReEco), which consists of triple-turn conversations featuring haters' reactions, to assess the impact of counterspeech. The linguistic analysis sheds light on the language of counterspeech that elicits different reactions from haters. Experimental results demonstrate that the 3-way classification model outperforms the two-stage reaction predictor, which first predicts reentry and then determines the reentry type. We conclude the study with an assessment showing the most common errors made by the best-performing model.

Assessing the Human Likeness of AI-Generated Counterspeech

Oct 14, 2024

Xiaoying Song, Sujana Mamidisetty, Eduardo Blanco, Lingzi Hong

Abstract: Counterspeech is a targeted response to counteract and challenge abusive or hateful content. It can effectively curb the spread of hatred and foster constructive online communication. Previous studies have proposed different strategies for automatically generated counterspeech. Evaluations, however, focus on the relevance, surface form, and other shallow linguistic characteristics. In this paper, we investigate the human likeness of AI-generated counterspeech, a critical factor influencing effectiveness. We implement and evaluate several LLM-based generation strategies, and discover that AI-generated and human-written counterspeech can be easily distinguished by both simple classifiers and humans. Further, we reveal differences in linguistic characteristics, politeness, and specificity.

Memorization In In-Context Learning

Aug 21, 2024

Shahriar Golchin, Mihai Surdeanu, Steven Bethard, Eduardo Blanco, Ellen Riloff

Abstract: In-context learning (ICL) has proven to be an effective strategy for improving the performance of large language models (LLMs) with no additional training. However, the exact mechanism behind these performance improvements remains unclear. This study is the first to show how ICL surfaces memorized training data and to explore the correlation between this memorization and performance across various ICL regimes: zero-shot, few-shot, and many-shot. Our most notable findings include: (1) ICL significantly surfaces memorization compared to zero-shot learning in most cases; (2) demonstrations, without their labels, are the most effective element in surfacing memorization; (3) ICL improves performance when the surfaced memorization in few-shot regimes reaches a high level (about 40%); and (4) there is a very strong correlation between performance and memorization in ICL when it outperforms zero-shot learning. Overall, our study uncovers a hidden phenomenon -- memorization -- at the core of ICL, raising an important question: to what extent do LLMs truly generalize from demonstrations in ICL, and how much of their success is due to memorization?

* v1

UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization

Jul 03, 2024

Md Nayem Uddin, Amir Saeidi, Divij Handa, Agastya Seth, Tran Cao Son, Eduardo Blanco, Steven R. Corman, Chitta Baral

Abstract: This paper introduces UnSeenTimeQA, a novel time-sensitive question-answering (TSQA) benchmark that diverges from traditional TSQA benchmarks by avoiding factual and web-searchable queries. We present a series of time-sensitive event scenarios decoupled from real-world factual information. The benchmark requires large language models (LLMs) to engage in genuine temporal reasoning, disassociating from the knowledge acquired during the pre-training phase. Our evaluation of six open-source LLMs (ranging from 2B to 70B in size) and three closed-source LLMs reveals that the questions from UnSeenTimeQA present substantial challenges. This indicates the models' difficulties in handling complex temporal reasoning scenarios. Additionally, we present several analyses shedding light on the models' performance in answering time-sensitive questions.

LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

Jun 25, 2024

Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath (+30 more)

Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as they have to spend more time reading, writing, and reviewing papers. This raises the question: how can LLMs potentially assist researchers in alleviating their heavy workload? This study focuses on how LLMs can assist NLP researchers, particularly examining the effectiveness of LLMs in assisting paper (meta-)reviewing and its recognizability. To address this, we constructed the ReviewCritique dataset, which includes two types of information: (i) NLP papers (initial submissions rather than camera-ready) with both human-written and LLM-generated reviews, and (ii) "deficiency" labels and corresponding explanations for individual segments of each review, annotated by experts. Using ReviewCritique, this study explores two threads of research questions: (i) "LLMs as Reviewers", how do reviews generated by LLMs compare with those written by humans in terms of quality and distinguishability? (ii) "LLMs as Metareviewers", how effectively can LLMs identify potential issues, such as deficient or unprofessional review segments, within individual paper reviews? To our knowledge, this is the first work to provide such a comprehensive analysis.

Paraphrasing in Affirmative Terms Improves Negation Understanding

Jun 11, 2024

MohammadHossein Rezaei, Eduardo Blanco

Abstract: Negation is a common linguistic phenomenon. Yet language models face challenges with negation in many natural language understanding tasks such as question answering and natural language inference. In this paper, we experiment with seamless strategies that incorporate affirmative interpretations (i.e., paraphrases without negation) to make models more robust against negation. Crucially, our affirmative interpretations are obtained automatically. We show improvements with CondaQA, a large corpus requiring reasoning with negation, and five natural language understanding tasks.

* Accepted to ACL 2024

Asking and Answering Questions to Extract Event-Argument Structures

Apr 25, 2024

Md Nayem Uddin, Enfa Rose George, Eduardo Blanco, Steven Corman

Abstract: This paper presents a question-answering approach to extract document-level event-argument structures. We automatically ask and answer questions for each argument type an event may have. Questions are generated using manually defined templates and generative transformers. Template-based questions are generated using predefined role-specific wh-words and event triggers from the context document. Transformer-based questions are generated using large language models trained to formulate questions based on a passage and the expected answer. Additionally, we develop novel data augmentation strategies specialized in inter-sentential event-argument relations. We use a simple span-swapping technique, coreference resolution, and large language models to augment the training instances. Our approach enables transfer learning without any corpus-specific modifications and yields competitive results on the RAMS dataset. It outperforms previous work, and it is especially beneficial for extracting arguments that appear in different sentences than the event trigger. We also present detailed quantitative and qualitative analyses shedding light on the most common errors made by our best model.

* Accepted at LREC-COLING 2024
