The Large Language Model Bias Index (LLMBI) is a pioneering approach designed to quantify and address biases inherent in large language models (LLMs), such as GPT-4. We recognise the increasing prevalence and impact of LLMs across diverse sectors. This research introduces a novel metric, LLMBI, to systematically measure and mitigate biases potentially skewing model responses. We formulated LLMBI using a composite scoring system incorporating multiple dimensions of bias, including but not limited to age, gender, and racial biases. To operationalise this metric, we engaged in a multi-step process involving collecting and annotating LLM responses, applying sophisticated Natural Language Processing (NLP) techniques for bias detection, and computing the LLMBI score through a specially crafted mathematical formula. The formula integrates weighted averages of various bias dimensions, a penalty for dataset diversity deficiencies, and a correction for sentiment biases. Our empirical analysis, conducted using responses from OpenAI's API, employs advanced sentiment analysis as a representative method for bias detection. The research reveals LLMs, whilst demonstrating impressive capabilities in text generation, exhibit varying degrees of bias across different dimensions. LLMBI provides a quantifiable measure to compare biases across models and over time, offering a vital tool for systems engineers, researchers and regulators in enhancing the fairness and reliability of LLMs. It highlights the potential of LLMs in mimicking unbiased human-like responses. Additionally, it underscores the necessity of continuously monitoring and recalibrating such models to align with evolving societal norms and ethical standards.
Aspect-based sentiment analysis (ABSA) is a widely studied topic, most often trained through supervision from human annotations of opinionated texts. These fine-grained annotations include identifying aspects towards which a user expresses their sentiment, and their associated polarities (aspect-based sentiments). Such fine-grained annotations can be expensive and often infeasible to obtain in real-world settings. There is, however, an abundance of scenarios where user-generated text contains an overall sentiment, such as a rating of 1-5 in user reviews or user-generated feedback, which may be leveraged for this task. In this paper, we propose a VAE-based topic modeling approach that performs ABSA using document-level supervision and without requiring fine-grained labels for either aspects or sentiments. Our approach allows for the detection of multiple aspects in a document, thereby allowing for the possibility of reasoning about how sentiment expressed through multiple aspects comes together to form an observable overall document-level sentiment. We demonstrate results on two benchmark datasets from two different domains, significantly outperforming a state-of-the-art baseline.
Researchers commonly perform sentiment analysis on large collections of short texts like tweets, Reddit posts or newspaper headlines that are all focused on a specific topic, theme or event. Usually, general purpose sentiment analysis methods are used which perform well on average but miss the variation in meaning that happens across different contexts, for example, the word "active" has a very different intention and valence in the phrase "active lifestyle" versus "active volcano". This work presents a new approach, CIDER (Context Informed Dictionary and sEntiment Reasoner), which performs context sensitive sentiment analysis, where the valence of sentiment laden terms is inferred from the whole corpus before being used to score the individual texts. In this paper we detail the CIDER algorithm and demonstrate that it outperforms state-of-the-art generalist sentiment analysis on a large collection of tweets about the weather. We have made our implementation of CIDER available as a python package: https://pypi.org/project/ciderpolarity/.
We propose emotion2vec, a universal speech emotion representation model. emotion2vec is pre-trained on open-source unlabeled emotion data through self-supervised online distillation, combining utterance-level loss and frame-level loss during pre-training. emotion2vec outperforms state-of-the-art pre-trained universal models and emotion specialist models by only training linear layers for the speech emotion recognition task on the mainstream IEMOCAP dataset. In addition, emotion2vec shows consistent improvements among 10 different languages of speech emotion recognition datasets. emotion2vec also shows excellent results on other emotion tasks, such as song emotion recognition, emotion prediction in conversation, and sentiment analysis. Comparison experiments, ablation experiments, and visualization comprehensively demonstrate the universal capability of the proposed emotion2vec. To the best of our knowledge, emotion2vec is the first universal representation model in various emotion-related tasks, filling a gap in the field.
The increasing volume of online reviews has made possible the development of sentiment analysis models for determining the opinion of customers regarding different products and services. Until now, sentiment analysis has proven to be an effective tool for determining the overall polarity of reviews. To improve the granularity at the aspect level for a better understanding of the service or product, the task of aspect-based sentiment analysis aims to first identify aspects and then determine the user's opinion about them. The complexity of this task lies in the fact that the same review can present multiple aspects, each with its own polarity. Current solutions have poor performance on such data. We address this problem by proposing ATESA-B{\AE}RT, a heterogeneous ensemble learning model for Aspect-Based Sentiment Analysis. Firstly, we divide our problem into two sub-tasks, i.e., Aspect Term Extraction and Aspect Term Sentiment Analysis. Secondly, we use the \textit{argmax} multi-class classification on six transformers-based learners for each sub-task. Initial experiments on two datasets prove that ATESA-B{\AE}RT outperforms current state-of-the-art solutions while solving the many aspects problem.
The renowned proverb that "The pen is mightier than the sword" underscores the formidable influence wielded by text expressions in shaping sentiments. Indeed, well-crafted written can deeply resonate within cultures, conveying profound sentiments. Nowadays, the omnipresence of the Internet has fostered a subculture that congregates around the contemporary milieu. The subculture artfully articulates the intricacies of human feelings by ardently pursuing the allure of novelty, a fact that cannot be disregarded in the sentiment analysis. This paper strives to enrich data through the lens of subculture, to address the insufficient training data faced by sentiment analysis. To this end, a new approach of subculture-based data augmentation (SCDA) is proposed, which engenders six enhanced texts for each training text by leveraging the creation of six diverse subculture expression generators. The extensive experiments attest to the effectiveness and potential of SCDA. The results also shed light on the phenomenon that disparate subculture expressions elicit varying degrees of sentiment stimulation. Moreover, an intriguing conjecture arises, suggesting the linear reversibility of certain subculture expressions. It is our fervent aspiration that this study serves as a catalyst in fostering heightened perceptiveness towards the tapestry of information, sentiment and culture, thereby enriching our collective understanding.
Reddiment is a web-based dashboard that links sentiment analysis of subreddit texts with share prices. The system consists of a backend, frontend and various services. The backend, in Node.js, manages the data and communicates with crawlers that collect Reddit comments and stock market data. Sentiment is analyzed with the help of Vader and TextBlob. The frontend, based on SvelteKit, provides users with a dashboard for visualization. The distribution is carried out via Docker containers and Docker Compose. The project offers expansion options, e.g. the integration of cryptocurrency rates. Reddiment enables the analysis of sentiment and share prices from subreddit data.
This paper presents a state-of-the-art solution to the LongEval CLEF 2023 Lab Task 2: LongEval-Classification. The goal of this task is to improve and preserve the performance of sentiment analysis models across shorter and longer time periods. Our framework feeds date-prefixed textual inputs to a pre-trained language model, where the timestamp is included in the text. We show date-prefixed samples better conditions model outputs on the temporal context of the respective texts. Moreover, we further boost performance by performing self-labeling on unlabeled data to train a student model. We augment the self-labeling process using a novel augmentation strategy leveraging the date-prefixed formatting of our samples. We demonstrate concrete performance gains on the LongEval-Classification evaluation set over non-augmented self-labeling. Our framework achieves a 2nd place ranking with an overall score of 0.6923 and reports the best Relative Performance Drop (RPD) of -0.0656 over the short evaluation set.
Entity-level fine-grained sentiment analysis in the financial domain is a crucial subtask of sentiment analysis and currently faces numerous challenges. The primary challenge stems from the lack of high-quality and large-scale annotated corpora specifically designed for financial text sentiment analysis, which in turn limits the availability of data necessary for developing effective text processing techniques. Recent advancements in large language models (LLMs) have yielded remarkable performance in natural language processing tasks, primarily centered around language pattern matching. In this paper, we propose a novel and extensive Chinese fine-grained financial sentiment analysis dataset, FinChina SA, for enterprise early warning. We thoroughly evaluate and experiment with well-known existing open-source LLMs using our dataset. We firmly believe that our dataset will serve as a valuable resource to advance the exploration of real-world financial sentiment analysis tasks, which should be the focus of future research. Our dataset and all code to replicate the experimental results will be released.