Christian Wolff

Prompting Is All You Need: Multi-view Prompting Large Language Models for Aspect-Based Sentiment Analysis

May 27, 2026

Nils Constantin Hellwig, Niklas Donhauser, Jakob Fehle, Udo Kruschwitz, Christian Wolff

Abstract:Recent work explored the capabilities of Large Language Models (LLMs) in Aspect-Based Sentiment Analysis (ABSA) through few-shot prompting, requiring substantially fewer annotated examples while achieving notable improvements over zero-shot baselines. However, a performance gap remained compared to models fine-tuned on hundreds of examples, and the computational costs of LLM inference present practical barriers to deployment. We introduce LLM-based Multi-View Prompting (LLM-MvP), which adapts the multi-view principle of considering multiple element orderings to LLM prompting. By combining schema-constrained decoding with a context-free grammar and prefix batching, LLM-MvP achieves performance competitive or superior to fine-tuned approaches while substantially reducing computational overhead. Extensive experiments across five benchmark datasets demonstrate that LLM-MvP closes the gap between few-shot prompting and fine-tuned models, offering a practical and efficient solution for ABSA.

Annotation Quality in Aspect-Based Sentiment Analysis: A Case Study Comparing Experts, Students, Crowdworkers, and Large Language Model

May 05, 2026

Niklas Donhauser, Jakob Fehle, Nils Constantin Hellwig, Markus Weinberger, Udo Kruschwitz, Christian Wolff

Abstract:Aspect-Based Sentiment Analysis (ABSA) enables fine-grained opinion analysis by identifying sentiments toward specific aspects or targets within a text. While ABSA has been widely studied for English, research on other languages such as German remains limited, largely due to the lack of high-quality annotated datasets. This paper examines how different annotation sources influence the development of German ABSA. To this end, an existing dataset is re-annotated by experts to establish a ground truth, which serves as a reference for evaluating annotations produced by students, crowdworkers, Large Language Models (LLMs), and experts. Annotation quality is compared using Inter-Annotator Agreement (IAA) and its impact on downstream model performance for different ABSA subtasks. The evaluation focuses on Aspect Category Sentiment Analysis (ACSA) and Target Aspect Sentiment Detection (TASD). We apply State-of-the-Art (SOTA) methods for ABSA, including BERT-, T5-, and LLaMA-based approaches to assess performance differences, spanning fine-tuning and in-context learning with instruction prompts. The findings provide practical insights into trade-offs between annotation reliability and efficiency, offering guidance for dataset construction in under-resourced Natural Language Processing (NLP) scenarios.

Seeing Candidates at Scale: Multimodal LLMs for Visual Political Communication on Instagram

Apr 21, 2026

Michael Achmann-Denkler, Mario Haim, Christian Wolff

Abstract:This paper presents a computational case study that evaluates the capabilities of specialized machine learning models and emerging multimodal large language models for Visual Political Communication (VPC) analysis. Focusing on concentrated visibility in Instagram stories and posts during the 2021 German federal election campaign, we compare the performance of traditional computer vision models (FaceNet512, RetinaFace, Google Cloud Vision) with a multimodal large language model (GPT-4o) in identifying front-runner politicians and counting individuals in images. GPT-4o outperformed the other models, achieving a macro F1-score of 0.89 for face recognition and 0.86 for person counting in stories. These findings demonstrate the potential of advanced AI systems to scale and refine visual content analysis in political communication while highlighting methodological considerations for future research.

* An earlier version was presented at #SMSociety 2024 (London)

LLM-as-an-Annotator: Training Lightweight Models with LLM-Annotated Examples for Aspect Sentiment Tuple Prediction

Mar 02, 2026

Nils Constantin Hellwig, Jakob Fehle, Udo Kruschwitz, Christian Wolff

Abstract:Training models for Aspect-Based Sentiment Analysis (ABSA) tasks requires manually annotated data, which is expensive and time-consuming to obtain. This paper introduces LA-ABSA, a novel approach that leverages Large Language Model (LLM)-generated annotations to fine-tune lightweight models for complex ABSA tasks. We evaluate our approach on five datasets for Target Aspect Sentiment Detection (TASD) and Aspect Sentiment Quad Prediction (ASQP). Our approach outperformed previously reported augmentation strategies and achieved competitive performance with LLM-prompting in low-resource scenarios, while providing substantial energy efficiency benefits. For example, using 50 annotated examples for in-context learning (ICL) to guide the annotation of unlabeled data, LA-ABSA achieved an F1 score of 49.85 for ASQP on the SemEval Rest16 dataset, closely matching the performance of ICL prompting with Gemma-3-27B (51.10), while requiring significantly lower computational resources.

* Accepted for publication at LREC 2026. Final version will appear in the ACL Anthology

nchellwig at SemEval-2026 Task 3: Self-Consistent Structured Generation (SCSG) for Dimensional Aspect-Based Sentiment Analysis using Large Language Models

Mar 02, 2026

Nils Constantin Hellwig, Jakob Fehle, Udo Kruschwitz, Christian Wolff

Abstract:We present Self-Consistent Structured Generation (SCSG) for Dimensional Aspect-Based Sentiment Analysis in SemEval-2026 Task 3 (Track A). SCSG enhances prediction reliability by executing a LoRA-adapted large language model multiple times per instance, retaining only tuples that achieve a majority consensus across runs. To mitigate the computational overhead of multiple forward passes, we leverage vLLM's PagedAttention mechanism for efficient key-value cache reuse. Evaluation across 6 languages and 8 language-domain combinations demonstrates that self-consistency with 15 executions yields statistically significant improvements over single-inference prompting, with our system (leveraging Gemma 3) ranking in the top seven across all settings, achieving second place on three out of four English subsets and first place on Tatar-Restaurant for DimASTE.
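
The majority-consensus step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `majority_vote`, the strict more-than-half threshold, and the simplified `(aspect, label)` tuples are all assumptions made for illustration, and the abstract does not specify how tuples from different runs are matched.

```python
from collections import Counter


def majority_vote(runs, threshold=0.5):
    """Keep tuples that appear in more than `threshold` of the runs.

    `runs` is a list of predictions, one list of tuples per model execution.
    The strict majority rule is an assumption, not taken from the paper.
    """
    counts = Counter()
    for run in runs:
        # Count each distinct tuple at most once per run.
        counts.update(set(run))
    min_votes = len(runs) * threshold
    return [tup for tup, votes in counts.items() if votes > min_votes]


# Hypothetical predictions from three executions on the same sentence.
runs = [
    [("pizza", "positive"), ("service", "negative")],
    [("pizza", "positive")],
    [("pizza", "positive"), ("waiter", "negative")],
]
kept = majority_vote(runs)
print(kept)
```

Only the tuple predicted in all three runs survives here; the two tuples that each appear in a single run are discarded.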

AnnoABSA: A Web-Based Annotation Tool for Aspect-Based Sentiment Analysis with Retrieval-Augmented Suggestions

Mar 02, 2026

Nils Constantin Hellwig, Jakob Fehle, Udo Kruschwitz, Christian Wolff

Abstract:We introduce AnnoABSA, the first web-based annotation tool to support the full spectrum of Aspect-Based Sentiment Analysis (ABSA) tasks. The tool is highly customizable, enabling flexible configuration of sentiment elements and task-specific requirements. Alongside manual annotation, AnnoABSA provides optional Large Language Model (LLM)-based retrieval-augmented generation (RAG) suggestions that offer context-aware assistance in a human-in-the-loop approach, keeping the human annotator in control. To improve prediction quality over time, the system retrieves the ten most similar examples that are already annotated and adds them as few-shot examples in the prompt, ensuring that suggestions become increasingly accurate as the annotation process progresses. Released as open-source software under the MIT License, AnnoABSA is freely accessible and easily extendable for research and practical applications.

* Accepted for publication at LREC 2026. Final version will appear in the ACL Anthology
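
The retrieval step mentioned in the abstract (selecting the ten most similar annotated examples as few-shot examples) might look roughly like the sketch below. The bag-of-words cosine similarity, the prompt layout, and the names `top_k_examples` and `build_prompt` are hypothetical; the abstract does not state which similarity measure or prompt format AnnoABSA uses.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k_examples(query, annotated, k=10):
    """Return the k annotated examples most similar to the query text."""
    q = Counter(query.lower().split())
    ranked = sorted(
        annotated,
        key=lambda ex: cosine(q, Counter(ex["text"].lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, examples):
    """Place retrieved examples before the query as few-shot demonstrations."""
    blocks = [f"Text: {ex['text']}\nLabels: {ex['labels']}" for ex in examples]
    blocks.append(f"Text: {query}\nLabels:")
    return "\n\n".join(blocks)


# Hypothetical already-annotated pool.
annotated = [
    {"text": "the pizza was great", "labels": [("pizza", "positive")]},
    {"text": "slow service", "labels": [("service", "negative")]},
    {"text": "great pizza and pasta", "labels": [("pizza", "positive")]},
]
top = top_k_examples("pizza was great", annotated, k=2)
prompt = build_prompt("pizza was great", top)
```

As the annotated pool grows, the retrieved demonstrations come closer to the text being labeled, which is the effect the abstract attributes to the suggestions improving over time.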

Do we still need Human Annotators? Prompting Large Language Models for Aspect Sentiment Quad Prediction

Feb 18, 2025

Nils Constantin Hellwig, Jakob Fehle, Udo Kruschwitz, Christian Wolff

Abstract:Aspect sentiment quadruple prediction (ASQP) facilitates a detailed understanding of opinions expressed in a text by identifying the opinion term, aspect term, aspect category and sentiment polarity for each opinion. However, annotating a full set of training examples to fine-tune models for ASQP is a resource-intensive process. In this study, we explore the capabilities of large language models (LLMs) for zero- and few-shot learning on the ASQP task across five diverse datasets. We report F1 scores slightly below those obtained with state-of-the-art fine-tuned models but exceeding previously reported zero- and few-shot performance. In the 40-shot setting on the Rest16 restaurant domain dataset, LLMs achieved an F1 score of 52.46, compared to 60.39 by the best-performing fine-tuned method MVP. Additionally, we report the performance of LLMs in target aspect sentiment detection (TASD), where the F1 scores were also close to fine-tuned models, achieving 66.03 on Rest16 in the 40-shot setting, compared to 72.76 with MVP. While human annotators remain essential for achieving optimal performance, LLMs can reduce the need for extensive manual annotation in ASQP tasks.

Detecting Calls to Action in Multimodal Content: Analysis of the 2021 German Federal Election Campaign on Instagram

Sep 04, 2024

Michael Achmann-Denkler, Jakob Fehle, Mario Haim, Christian Wolff

Abstract:This study investigates the automated classification of Calls to Action (CTAs) within the 2021 German Instagram election campaign to advance the understanding of mobilization in social media contexts. We analyzed over 2,208 Instagram stories and 712 posts using fine-tuned BERT models and OpenAI's GPT-4 models. The fine-tuned BERT model incorporating synthetic training data achieved a macro F1 score of 0.93, demonstrating a robust classification performance. Our analysis revealed that 49.58% of Instagram posts and 10.64% of stories contained CTAs, highlighting significant differences in mobilization strategies between these content types. Additionally, we found that FDP and the Greens had the highest prevalence of CTAs in posts, whereas CDU and CSU led in story CTAs.

* Accepted Archival Paper for the CPSS Workshop at KONVENS 2024. Camera Ready Submission

GERestaurant: A German Dataset of Annotated Restaurant Reviews for Aspect-Based Sentiment Analysis

Aug 15, 2024

Nils Constantin Hellwig, Jakob Fehle, Markus Bink, Christian Wolff

Abstract:We present GERestaurant, a novel dataset consisting of 3,078 German language restaurant reviews manually annotated for Aspect-Based Sentiment Analysis (ABSA). All reviews were collected from Tripadvisor, covering a diverse selection of restaurants, including regional and international cuisine with various culinary styles. The annotations encompass both implicit and explicit aspects, including all aspect terms, their corresponding aspect categories, and the sentiments expressed towards them. Furthermore, we provide baseline scores for the four ABSA tasks Aspect Category Detection, Aspect Category Sentiment Analysis, End-to-End ABSA and Target Aspect Sentiment Detection as a reference point for future advances. The dataset fills a gap in German language resources and facilitates exploration of ABSA in the restaurant domain.

* Accepted in KONVENS 2024. Camera Ready submission

Innovations in Cover Song Detection: A Lyrics-Based Approach

Jun 06, 2024

Maximilian Balluff, Peter Mandl, Christian Wolff

Abstract:Cover songs are alternate versions of a song by a different artist. Long being a vital part of the music industry, cover songs significantly influence music culture and are commonly heard in public venues. The rise of online music platforms has further increased their prevalence, often as background music or video soundtracks. While current automatic identification methods serve adequately for original songs, they are less effective with cover songs, primarily because cover versions often significantly deviate from the original compositions. In this paper, we propose a novel method for cover song detection that utilizes the lyrics of a song. We introduce a new dataset for cover songs and their corresponding originals. The dataset contains 5078 cover songs and 2828 original songs. In contrast to other cover song datasets, it contains the annotated lyrics for the original song and the cover song. We evaluate our method on this dataset and compare it with multiple baseline approaches. Our results show that our method outperforms the baseline approaches.

* 6 pages, 3 figures
