Chris Biemann

Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding

Mar 27, 2024

Xintong Wang, Jingheng Pan, Liang Ding, Chris Biemann

Abstract:Large Vision-Language Models (LVLMs) are increasingly adept at generating contextually detailed and coherent responses from visual inputs. However, their application in multimodal decision-making and open-ended generation is hindered by a notable rate of hallucinations, where generated text inaccurately represents the visual contents. To address this issue, this paper introduces the Instruction Contrastive Decoding (ICD) method, a novel approach designed to reduce hallucinations during LVLM inference. Our method is inspired by our observation that what we call disturbance instructions significantly exacerbate hallucinations in multimodal fusion modules. ICD contrasts distributions from standard and instruction disturbance, thereby increasing alignment uncertainty and effectively subtracting hallucinated concepts from the original distribution. Through comprehensive experiments on discriminative benchmarks (POPE and MME) and a generative benchmark (LLaVa-Bench), we demonstrate that ICD significantly mitigates both object-level and attribute-level hallucinations. Moreover, our method not only addresses hallucinations but also significantly enhances the general perception and recognition capabilities of LVLMs.
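The contrast between the standard and the disturbed distribution can be illustrated with a small sketch. The abstract does not give the exact formula, so the code below uses one common form of contrastive decoding as an assumption: amplify the standard logits and subtract the logits obtained under a disturbance instruction. The vocabulary, the logit values, and the `alpha` weight are all made up for illustration.

```python
import math

def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def contrastive_logits(standard, disturbed, alpha=1.0):
    """Assumed contrastive form: (1 + alpha) * standard - alpha * disturbed.

    Tokens that gain score under the disturbance instruction are pushed down.
    """
    return [(1 + alpha) * s - alpha * d for s, d in zip(standard, disturbed)]

# Hypothetical next-token logits over a toy vocabulary.
vocab = ["cat", "dog", "frisbee"]
standard = [2.0, 1.0, 2.2]    # standard instruction: slightly prefers "frisbee"
disturbed = [1.0, 1.0, 3.0]   # disturbance instruction: strongly prefers "frisbee"

adjusted = contrastive_logits(standard, disturbed, alpha=1.0)
probs = softmax(adjusted)
best = vocab[probs.index(max(probs))]
print(best)
```

In this toy case the token that the disturbance instruction inflates ("frisbee") loses its lead after the contrast, and the decoder falls back to "cat".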

On Zero-Shot Counterspeech Generation by LLMs

Mar 22, 2024

Punyajoy Saha, Aalok Agrawal, Abhik Jana, Chris Biemann, Animesh Mukherjee

Abstract:With the emergence of numerous Large Language Models (LLMs), the usage of such models in various Natural Language Processing (NLP) applications is increasing extensively. Counterspeech generation is one such key task where efforts are made to develop generative models by fine-tuning LLMs with hatespeech-counterspeech pairs, but none of these attempts explores the intrinsic properties of large language models in zero-shot settings. In this work, we present a comprehensive analysis of the performance of four LLMs, namely GPT-2, DialoGPT, ChatGPT and FlanT5, in zero-shot settings for counterspeech generation, which is the first of its kind. For GPT-2 and DialoGPT, we further investigate the deviation in performance with respect to the sizes (small, medium, large) of the models. In addition, we propose three different prompting strategies for generating different types of counterspeech and analyse the impact of such strategies on the performance of the models. Our analysis shows that there is an improvement in generation quality for two datasets (17%); however, toxicity increases (25%) with increasing model size. Considering the type of model, GPT-2 and FlanT5 models are significantly better in terms of counterspeech quality but also have high toxicity as compared to DialoGPT. ChatGPT is much better at generating counterspeech than other models across all metrics. In terms of prompting, we find that our proposed strategies help in improving counterspeech generation across all the models.

* 12 pages, 7 tables, accepted at LREC-COLING 2024

SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages

Feb 15, 2024

Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Abinew Ali Ayele, Pavan Baswani(+17 more)

Abstract:Exploring and quantifying semantic relatedness is central to representing language. It holds significant implications across various NLP tasks, including offering insights into the capabilities and performance of Large Language Models (LLMs). While earlier NLP research primarily focused on semantic similarity, often within the English language context, we instead investigate the broader phenomenon of semantic relatedness. In this paper, we present SemRel, a new semantic relatedness dataset collection annotated by native speakers across 14 languages: Afrikaans, Algerian Arabic, Amharic, English, Hausa, Hindi, Indonesian, Kinyarwanda, Marathi, Moroccan Arabic, Modern Standard Arabic, Punjabi, Spanish, and Telugu. These languages originate from five distinct language families and are predominantly spoken in Africa and Asia -- regions characterised by a relatively limited availability of NLP resources. Each instance in the SemRel datasets is a sentence pair associated with a score that represents the degree of semantic textual relatedness between the two sentences. The scores are obtained using a comparative annotation framework. We describe the data collection and annotation processes, related challenges when building the datasets, and their impact and utility in NLP. We further report experiments for each language and across the different languages.

* 18 pages

Probing Language Models from A Human Behavioral Perspective

Oct 08, 2023

Xintong Wang, Xiaoyu Li, Xingshan Li, Chris Biemann

Abstract:Large Language Models (LLMs) have emerged as dominant foundational models in modern NLP. However, the understanding of their prediction process and internal mechanisms, such as feed-forward networks and multi-head self-attention, remains largely unexplored. In this study, we probe LLMs from a human behavioral perspective, correlating values from LLMs with eye-tracking measures, which are widely recognized as meaningful indicators of reading patterns. Our findings reveal that LLMs exhibit a prediction pattern distinct from that of RNN-based LMs. Moreover, with the escalation of FFN layers, the capacity for memorization and linguistic knowledge encoding also surges until it peaks, subsequently pivoting to focus on comprehension capacity. The functions of self-attention are distributed across multiple heads. Lastly, we scrutinize the gate mechanisms, finding that they control the flow of information, with some gates promoting, while others eliminating information.

DBLPLink: An Entity Linker for the DBLP Scholarly Knowledge Graph

Sep 25, 2023

Debayan Banerjee, Arefa, Ricardo Usbeck, Chris Biemann

Abstract:In this work, we present a web application named DBLPLink, which performs entity linking over the DBLP scholarly knowledge graph. DBLPLink uses text-to-text pre-trained language models, such as T5, to produce entity label spans from an input text question. Entity candidates are fetched from a database based on the labels, and an entity re-ranker sorts them based on entity embeddings, such as TransE, DistMult and ComplEx. The results are displayed so that users may compare and contrast the results between T5-small, T5-base and the different KG embeddings used. The demo can be accessed at https://ltdemos.informatik.uni-hamburg.de/dblplink/.

* Accepted at International Semantic Web Conference (ISWC) 2023 Posters & Demo Track

The Role of Output Vocabulary in T2T LMs for SPARQL Semantic Parsing

May 24, 2023

Debayan Banerjee, Pranav Ajit Nair, Ricardo Usbeck, Chris Biemann

Abstract:In this work, we analyse the role of output vocabulary for text-to-text (T2T) models on the task of SPARQL semantic parsing. We perform experiments within the context of knowledge graph question answering (KGQA), where the task is to convert questions in natural language to the SPARQL query language. We observe that the query vocabulary is distinct from human vocabulary. Language Models (LMs) are predominantly trained for human language tasks, and hence, if the query vocabulary is replaced with a vocabulary more attuned to the LM tokenizer, the performance of models may improve. We carry out carefully selected vocabulary substitutions on the queries and find absolute gains in the range of 17% on the GrailQA dataset.

* Accepted as a short paper to ACL 2023 findings
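The idea of substituting query vocabulary can be sketched as a reversible token mapping applied before training and undone after generation. The abstract does not list the substitutions the authors chose, so the mapping, the function names, and the example query below are hypothetical and only show the mechanics.

```python
# Hypothetical substitutions: SPARQL keywords and symbols are mapped to
# ordinary words that an LM tokenizer is more likely to handle well.
SUBSTITUTIONS = {
    "SELECT": "choose",
    "DISTINCT": "unique",
    "WHERE": "where",
    "{": "open",
    "}": "close",
}
# Inverse mapping used to restore an executable query after generation.
INVERSE = {word: token for token, word in SUBSTITUTIONS.items()}

def encode(query):
    """Replace query tokens with their substitute words (whitespace-tokenized)."""
    return " ".join(SUBSTITUTIONS.get(tok, tok) for tok in query.split())

def decode(text):
    """Map substitute words back to the original query tokens."""
    return " ".join(INVERSE.get(tok, tok) for tok in text.split())

query = "SELECT DISTINCT ?x WHERE { ?x wdt:P31 wd:Q5 }"
encoded = encode(query)
print(encoded)
print(decode(encoded) == query)
```

The mapping has to be one-to-one, and the substitute words must not otherwise occur in the queries; if either condition fails, decoding no longer restores the original query.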

DBLP-QuAD: A Question Answering Dataset over the DBLP Scholarly Knowledge Graph

Mar 29, 2023

Debayan Banerjee, Sushil Awale, Ricardo Usbeck, Chris Biemann

Abstract:In this work we create a question answering dataset over the DBLP scholarly knowledge graph (KG). DBLP is an on-line reference for bibliographic information on major computer science publications that indexes over 4.4 million publications published by more than 2.2 million authors. Our dataset consists of 10,000 question answer pairs with the corresponding SPARQL queries which can be executed over the DBLP KG to fetch the correct answer. DBLP-QuAD is the largest scholarly question answering dataset.

* 12 pages, CEUR-WS 1-column format, accepted at the International Bibliometric Information Retrieval Workshop @ ECIR 2023

GETT-QA: Graph Embedding based T2T Transformer for Knowledge Graph Question Answering

Mar 28, 2023

Debayan Banerjee, Pranav Ajit Nair, Ricardo Usbeck, Chris Biemann

Abstract:In this work, we present an end-to-end Knowledge Graph Question Answering (KGQA) system named GETT-QA. GETT-QA uses T5, a popular text-to-text pre-trained language model. The model takes a question in natural language as input and produces a simpler form of the intended SPARQL query. In the simpler form, the model does not directly produce entity and relation IDs. Instead, it produces corresponding entity and relation labels. The labels are grounded to KG entity and relation IDs in a subsequent step. To further improve the results, we instruct the model to produce a truncated version of the KG embedding for each entity. The truncated KG embedding enables a finer search for disambiguation purposes. We find that T5 is able to learn the truncated KG embeddings without any change of loss function, improving KGQA performance. As a result, we report strong results for LC-QuAD 2.0 and SimpleQuestions-Wikidata datasets on end-to-end KGQA over Wikidata.

* 16 pages, single-column format, accepted at the ESWC 2023 research track
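The disambiguation step described in the GETT-QA abstract can be pictured as follows: several KG entities share the predicted label, and the truncated embedding produced by the model picks among them. This is a minimal sketch under stated assumptions, not the authors' implementation: truncation is taken to mean "the first k dimensions", the distance is Euclidean, and the entity IDs and vectors are invented.

```python
import math

def truncated_distance(predicted, full_embedding):
    """Euclidean distance between the predicted truncated vector and the
    first len(predicted) dimensions of a candidate's full KG embedding.
    Truncating to the leading dimensions is an assumption of this sketch."""
    k = len(predicted)
    return math.dist(predicted, full_embedding[:k])

def disambiguate(predicted, candidates):
    """Return the ID of the candidate whose truncated embedding is closest."""
    closest = min(candidates, key=lambda c: truncated_distance(predicted, c[1]))
    return closest[0]

# Hypothetical candidates that share one label but are different entities.
candidates = [
    ("Q_city", [0.9, -0.2, 0.4, 0.1]),
    ("Q_film", [-0.5, 0.7, 0.0, 0.3]),
]
predicted = [0.8, -0.1]  # hypothetical truncated embedding from the model

chosen = disambiguate(predicted, candidates)
print(chosen)
```

Label matching alone cannot separate the two candidates here; the extra numeric signal is what makes the finer search possible.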

A System for Human-AI collaboration for Online Customer Support

Feb 07, 2023

Debayan Banerjee, Mathis Poser, Christina Wiethof, Varun Shankar Subramanian, Richard Paucar, Eva A. C. Bittner, Chris Biemann

Abstract:AI-enabled chatbots have recently been put to use to answer customer service queries; however, users commonly report that bots lack a personal touch and are often unable to understand the real intent of the user's question. To this end, it is desirable to have human involvement in the customer servicing process. In this work, we present a system where a human support agent collaborates in real-time with an AI agent to satisfactorily answer customer queries. We describe the user interaction elements of the solution, along with the machine learning techniques involved in the AI agent.

ARDIAS: AI-Enhanced Research Management, Discovery, and Advisory System

Jan 25, 2023

Debayan Banerjee, Seid Muhie Yimam, Sushil Awale, Chris Biemann

Abstract:In this work, we present ARDIAS, a web-based application that aims to provide researchers with a full suite of discovery and collaboration tools. ARDIAS currently allows searching for authors and articles by name and gaining insights into the research topics of a particular researcher. With the aid of AI-based tools, ARDIAS aims to recommend potential collaborators and topics to researchers. In the near future, we aim to add tools that allow researchers to communicate with each other and start new projects.
