"Text Classification": models, code, and papers

Prompt Tuning on Graph-augmented Low-resource Text Classification

Jul 15, 2023
Zhihao Wen, Yuan Fang

Text classification is a fundamental problem in information retrieval with many real-world applications, such as predicting the topics of online articles and the categories of e-commerce product descriptions. However, low-resource text classification, with no or few labeled samples, presents a serious concern for supervised learning. Meanwhile, many text data are inherently grounded on a network structure, such as a hyperlink/citation network for online articles, and a user-item purchase network for e-commerce products. These graph structures capture rich semantic relationships, which can potentially augment low-resource text classification. In this paper, we propose a novel model called Graph-Grounded Pre-training and Prompting (G2P2) to address low-resource text classification in a two-pronged approach. During pre-training, we propose three graph interaction-based contrastive strategies to jointly pre-train a graph-text model; during downstream classification, we explore handcrafted discrete prompts and continuous prompt tuning for the jointly pre-trained model to achieve zero- and few-shot classification, respectively. In addition, to generalize continuous prompts to unseen classes, we propose conditional prompt tuning on graphs (G2P2*). Extensive experiments on four real-world datasets demonstrate the strength of G2P2 in zero- and few-shot low-resource text classification tasks, and illustrate the advantage of G2P2* in dealing with unseen classes.

* 14 pages, journal under review. arXiv admin note: substantial text overlap with arXiv:2305.03324
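The abstract above mentions contrastive strategies for jointly pre-training a graph-text model but does not spell out the loss. As a rough illustration only, the sketch below computes a generic symmetric contrastive (InfoNCE-style) loss over a text-to-node similarity matrix, where matching pairs sit on the diagonal. The function name, the temperature value, and the toy matrices are assumptions for this example and are not taken from the paper.

```python
import math

def symmetric_contrastive_loss(sim, temperature=0.1):
    """Generic InfoNCE-style loss (illustrative, not the paper's exact objective).

    sim[i][j] is the similarity between text i and graph node j;
    pairs (i, i) are treated as the positives.
    """
    n = len(sim)

    def cross_entropy_rows(matrix):
        total = 0.0
        for i, row in enumerate(matrix):
            logits = [s / temperature for s in row]
            m = max(logits)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(l - m) for l in logits))
            total += log_z - logits[i]  # negative log-probability of the positive
        return total / n

    transposed = [list(col) for col in zip(*sim)]
    # Average the text-to-node and node-to-text directions.
    return 0.5 * (cross_entropy_rows(sim) + cross_entropy_rows(transposed))

# Toy similarity matrices (hypothetical values).
aligned = [[1.0, 0.1, 0.0], [0.2, 0.9, 0.1], [0.0, 0.1, 1.0]]
uninformative = [[0.5] * 3 for _ in range(3)]

loss_aligned = symmetric_contrastive_loss(aligned)
loss_uniform = symmetric_contrastive_loss(uninformative)
print(loss_aligned, loss_uniform)
```

When every similarity is equal, the loss reduces to the log of the batch size, so a well-aligned matrix should score lower than the uninformative one.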

Exploring Machine Learning and Transformer-based Approaches for Deceptive Text Classification: A Comparative Analysis

Aug 11, 2023
Anusuya Krishnan

Deceptive text classification is a critical task in natural language processing that aims to identify deceptive or fraudulent content. This study presents a comparative analysis of machine learning and transformer-based approaches for deceptive text classification. We investigate the effectiveness of traditional machine learning algorithms and state-of-the-art transformer models, such as BERT, XLNet, DistilBERT, and RoBERTa, in detecting deceptive text. A labeled dataset consisting of deceptive and non-deceptive texts is used for training and evaluation purposes. Through extensive experimentation, we compare the performance metrics, including accuracy, precision, recall, and F1 score, of the different approaches. The results of this study shed light on the strengths and limitations of machine learning and transformer-based methods for deceptive text classification, enabling researchers and practitioners to make informed decisions when dealing with deceptive content.

* 12 pages, 8 figures
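The four metrics named in the abstract above have standard definitions for a binary task. The sketch below computes them from a pair of label lists; the function name and the toy labels (1 = deceptive, 0 = non-deceptive) are made up for illustration and do not come from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels: 1 = deceptive, 0 = non-deceptive.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting all four together matters when classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the deceptive class.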

Making LLMs Worth Every Penny: Resource-Limited Text Classification in Banking

Nov 10, 2023
Lefteris Loukas, Ilias Stogiannidis, Odysseas Diamantopoulos, Prodromos Malakasiotis, Stavros Vassos

Standard Full-Data classifiers in NLP demand thousands of labeled examples, which is impractical in data-limited domains. Few-shot methods offer an alternative, utilizing contrastive learning techniques that can be effective with as few as 20 examples per class. Similarly, Large Language Models (LLMs) like GPT-4 can perform effectively with just 1-5 examples per class. However, the performance-cost trade-offs of these methods remain underexplored, a critical concern for budget-limited organizations. Our work addresses this gap by studying the aforementioned approaches over the Banking77 financial intent detection dataset, including the evaluation of cutting-edge LLMs by OpenAI, Cohere, and Anthropic in a comprehensive set of few-shot scenarios. We complete the picture with two additional methods: first, a cost-effective querying method for LLMs based on retrieval-augmented generation (RAG), able to reduce operational costs multiple times compared to classic few-shot approaches, and second, a data augmentation method using GPT-4, able to improve performance in data-limited scenarios. Finally, to inspire future research, we provide a human expert's curated subset of Banking77, along with extensive error analysis.

* Long paper accepted to ACM ICAIF-23
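The retrieval-based querying idea in the abstract above can be sketched as: instead of sending every labeled example to the LLM, retrieve only the few most similar to the query and place those in the prompt. The example below uses bag-of-words cosine similarity as a stand-in retriever; the paper's actual retriever, prompt wording, and example pool are not given here, so every name, string, and label below is a hypothetical placeholder.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words term counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve_examples(query, labeled_pool, k=2):
    """Return the k labeled examples most similar to the query."""
    q = bow(query)
    ranked = sorted(labeled_pool,
                    key=lambda ex: cosine(q, bow(ex[0])),
                    reverse=True)
    return ranked[:k]

def build_prompt(query, examples):
    """Assemble a short few-shot prompt from the retrieved examples only."""
    parts = ["Classify the banking query into an intent."]
    for text, label in examples:
        parts.append(f"Query: {text}\nIntent: {label}")
    parts.append(f"Query: {query}\nIntent:")
    return "\n\n".join(parts)

# Hypothetical labeled pool in the style of an intent-detection dataset.
pool = [
    ("my card has not arrived yet", "card_arrival"),
    ("how do i activate my new card", "activate_my_card"),
    ("what is the exchange rate for euros", "exchange_rate"),
    ("i was charged twice for one payment", "transaction_charged_twice"),
]
query = "when will my new card arrive"
selected = retrieve_examples(query, pool, k=2)
prompt = build_prompt(query, selected)
print(prompt)
```

Because the prompt carries only k retrieved examples rather than a fixed set per class, its length (and therefore the per-query token cost) stays roughly constant as the number of intents grows.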

GIELLM: Japanese General Information Extraction Large Language Model Utilizing Mutual Reinforcement Effect

Nov 12, 2023
Chengguang Gan, Qinghao Zhang, Tatsunori Mori

Information Extraction (IE) stands as a cornerstone in natural language processing, traditionally segmented into distinct sub-tasks. The advent of Large Language Models (LLMs) heralds a paradigm shift, suggesting the feasibility of a singular model addressing multiple IE subtasks. In this vein, we introduce the General Information Extraction Large Language Model (GIELLM), which integrates Text Classification, Sentiment Analysis, Named Entity Recognition, Relation Extraction, and Event Extraction using a uniform input-output schema. This innovation marks the first instance of a model simultaneously handling such a diverse array of IE subtasks. Notably, the GIELLM leverages the Mutual Reinforcement Effect (MRE), enhancing performance in integrated tasks compared to their isolated counterparts. Our experiments demonstrate State-of-the-Art (SOTA) results in five out of six Japanese mixed datasets, significantly surpassing GPT-3.5-Turbo. Further, an independent evaluation using the novel Text Classification Relation and Event Extraction (TCREE) dataset corroborates the synergistic advantages of MRE in text and word classification. This breakthrough paves the way for most IE subtasks to be subsumed under a singular LLM framework. Specialized, fine-tuned task-specific models are no longer needed.

* 10 pages, 6 figures

Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification

Sep 24, 2023
Muberra Ozmen, Joseph Cotnareanu, Mark Coates

Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text, and has a wide range of application domains. Most existing approaches require an enormous amount of annotated data to learn a classifier and/or a set of well-defined constraints on the label space structure, such as hierarchical relations, which may be complicated to provide as the number of labels increases. In this paper, we study the MLTC problem in annotation-free and scarce-annotation settings in which the magnitude of available supervision signals is linear in the number of labels. Our method follows three steps: (1) mapping input text into a set of preliminary label likelihoods by natural language inference using a pre-trained language model, (2) calculating a signed label dependency graph from label descriptions, and (3) updating the preliminary label likelihoods with message passing along the label dependency graph, driven by a collective loss function that injects the information of expected label frequency and average multi-label cardinality of predictions. The experiments show that the proposed framework achieves effective performance under low-supervision settings, with almost imperceptible computational and memory overheads added to the use of the pre-trained language model, outperforming its initial performance by 70% in terms of example-based F1 score.

* Proc. Conf. Lifelong Learning Agents (CoLLAs), 2023
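The abstract above reports results in example-based F1, which scores each example by the overlap between its predicted and true label sets and then averages over examples. A minimal sketch follows; the function name and toy labels are illustrative, and scoring an example with both sets empty as 1.0 is one common convention assumed here, not something the abstract states.

```python
def example_based_f1(true_sets, pred_sets):
    """Average per-example F1 between true and predicted label sets."""
    scores = []
    for true_labels, pred_labels in zip(true_sets, pred_sets):
        t, p = set(true_labels), set(pred_labels)
        if not t and not p:
            # Assumed convention: nothing to predict, nothing predicted.
            scores.append(1.0)
        else:
            scores.append(2 * len(t & p) / (len(t) + len(p)))
    return sum(scores) / len(scores)

# Hypothetical multi-label predictions for three documents.
true_sets = [{"sports", "politics"}, {"tech"}, {"health", "science"}]
pred_sets = [{"sports"}, {"tech", "science"}, {"health", "science"}]
score = example_based_f1(true_sets, pred_sets)
print(round(score, 4))
```

Unlike micro- or macro-averaged F1, this metric weights every document equally regardless of how many labels it carries.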

Efficient Trigger Word Insertion

Nov 23, 2023
Yueqi Zeng, Ziqiang Li, Pengfei Xia, Lei Liu, Bin Li

With the boom in the natural language processing (NLP) field in recent years, backdoor attacks pose immense threats against deep neural network models. However, previous works hardly consider the effect of the poisoning rate. In this paper, our main objective is to reduce the number of poisoned samples while still achieving a satisfactory Attack Success Rate (ASR) in text backdoor attacks. To accomplish this, we propose an efficient trigger word insertion strategy in terms of trigger word optimization and poisoned sample selection. Extensive experiments on different datasets and models demonstrate that our proposed method can significantly improve attack effectiveness in text classification tasks. Remarkably, our approach achieves an ASR of over 90% with only 10 poisoned samples in the dirty-label setting and requires merely 1.5% of the training data in the clean-label setting.

MatchXML: An Efficient Text-label Matching Framework for Extreme Multi-label Text Classification

Aug 25, 2023
Hui Ye, Rajshekhar Sunderraman, Shihao Ji

eXtreme Multi-label text Classification (XMC) refers to training a classifier that assigns a text sample relevant labels from an extremely large-scale label set (e.g., millions of labels). We propose MatchXML, an efficient text-label matching framework for XMC. We observe that the label embeddings generated from the sparse Term Frequency-Inverse Document Frequency (TF-IDF) features have several limitations. We thus propose label2vec to effectively train the semantic dense label embeddings by the Skip-gram model. The dense label embeddings are then used to build a Hierarchical Label Tree by clustering. In fine-tuning the pre-trained encoder Transformer, we formulate the multi-label text classification as a text-label matching problem in a bipartite graph. We then extract the dense text representations from the fine-tuned Transformer. Besides the fine-tuned dense text embeddings, we also extract the static dense sentence embeddings from a pre-trained Sentence Transformer. Finally, a linear ranker is trained by utilizing the sparse TF-IDF features, the fine-tuned dense text representations and static dense sentence features. Experimental results demonstrate that MatchXML achieves state-of-the-art accuracy on five out of six datasets. As for the speed, MatchXML outperforms the competing methods on all six datasets. Our source code is publicly available at https://github.com/huiyegit/MatchXML.
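For readers unfamiliar with the sparse TF-IDF features mentioned in the abstract above, the sketch below computes one simple variant: term frequency normalized by document length, multiplied by the unsmoothed inverse document frequency log(N / df). TF-IDF has many variants, and the one MatchXML uses is not stated here, so this weighting, the function name, and the toy documents are all assumptions.

```python
import math
from collections import Counter

def tfidf(docs):
    """Sparse TF-IDF vectors as dicts, using tf = count / doc length
    and idf = log(N / df). This is one simple variant among many."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical product titles.
docs = ["wireless phone charger", "phone case leather", "leather phone wallet"]
vectors = tfidf(docs)
print(vectors[0])
```

Under this variant a term that occurs in every document, such as "phone" here, gets weight zero, while rarer terms receive higher weights. Each vector only stores terms present in its document, which is the sparsity the abstract contrasts with dense embeddings.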

Short text classification with machine learning in the social sciences: The case of climate change on Twitter

Oct 03, 2023
Karina Shyrokykh, Maksym Girnyk, Lisa Dellmuth

To analyse large numbers of texts, social science researchers are increasingly confronting the challenge of text classification. When manual labeling is not possible and researchers have to find automated ways to classify texts, computer science provides a useful toolbox of machine-learning methods whose performance remains understudied in the social sciences. In this article, we compare the performance of the most widely used text classifiers by applying them to a typical research scenario in social science research: a relatively small labeled dataset with infrequent occurrence of categories of interest, which is a part of a large unlabeled dataset. As an example case, we look at Twitter communication regarding climate change, a topic of increasing scholarly interest in interdisciplinary social science research. Using a novel dataset including 5,750 tweets from various international organizations regarding the highly ambiguous concept of climate change, we evaluate the performance of methods in automatically classifying tweets based on whether they are about climate change or not. In this context, we highlight two main findings. First, supervised machine-learning methods perform better than state-of-the-art lexicons, in particular as class balance increases. Second, traditional machine-learning methods, such as logistic regression and random forest, perform similarly to sophisticated deep-learning methods, whilst requiring much less training time and computational resources. The results have important implications for the analysis of short texts in social science research.

* PLoS ONE 18(9): e0290762 (2023)

USA: Universal Sentiment Analysis Model & Construction of Japanese Sentiment Text Classification and Part of Speech Dataset

Sep 14, 2023
Chengguang Gan, Qinghao Zhang, Tatsunori Mori

Sentiment analysis is a pivotal task in the domain of natural language processing. It encompasses both text-level sentiment polarity classification and word-level Part of Speech (POS) sentiment polarity determination. Such analysis challenges models to understand text holistically while also extracting nuanced information. With the rise of Large Language Models (LLMs), new avenues for sentiment analysis have opened. This paper proposes enhancing performance by leveraging the Mutual Reinforcement Effect (MRE) between individual words and the overall text. It delves into how word polarity influences the overarching sentiment of a passage. To support our research, we annotated four novel Sentiment Text Classification and Part of Speech (SCPOS) datasets, building upon existing sentiment classification datasets. Furthermore, we developed a Universal Sentiment Analysis (USA) model, with a 7-billion parameter size. Experimental results revealed that our model surpassed the performance of gpt-3.5-turbo across all four datasets, underscoring the significance of MRE in sentiment analysis.

* Model already open-sourced; dataset will be released soon

DP-AdamBC: Your DP-Adam Is Actually DP-SGD (Unless You Apply Bias Correction)

Dec 21, 2023
Qiaoyue Tang, Frederick Shpilevskiy, Mathias Lécuyer

The Adam optimizer is a popular choice in contemporary deep learning, due to its strong empirical performance. However, we observe that in privacy-sensitive scenarios, the traditional use of Differential Privacy (DP) with the Adam optimizer leads to sub-optimal performance on several tasks. We find that this performance degradation is due to a DP bias in Adam's second moment estimator, introduced by the addition of independent noise in the gradient computation to enforce DP guarantees. This DP bias leads to a different scaling for low-variance parameter updates that is inconsistent with the behavior of non-private Adam. We propose DP-AdamBC, an optimization algorithm which removes the bias in the second moment estimation and retrieves the expected behaviour of Adam. Empirically, DP-AdamBC significantly improves the optimization performance of DP-Adam by up to 3.5% in final accuracy in image, text, and graph node classification tasks.

* Published as a conference paper at the 38th Annual AAAI Conference on Artificial Intelligence, Vancouver, 2024
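The bias described in the abstract above follows from a basic identity: if independent zero-mean noise with variance s^2 is added to a gradient g, the expected squared noisy gradient is g^2 + s^2, so a second-moment estimate built from noisy gradients is inflated by s^2. The simulation below illustrates that identity and a simple subtract-and-floor correction. It is a sketch under those assumptions only: the function names, the floor value, and the toy numbers are hypothetical, and the exact form of the correction used in DP-AdamBC is not given in the abstract.

```python
import random

def noisy_second_moment(grad, noise_std, n_samples, seed=0):
    """Monte Carlo estimate of E[(grad + z)^2] with z ~ N(0, noise_std^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        noisy = grad + rng.gauss(0.0, noise_std)
        total += noisy * noisy
    return total / n_samples

def bias_corrected(second_moment, noise_std, floor=1e-8):
    """Subtract the known noise variance; the floor (an assumed value)
    keeps the result positive so it can sit under a square root."""
    return max(second_moment - noise_std ** 2, floor)

true_grad = 0.1   # hypothetical small gradient coordinate
noise_std = 1.0   # hypothetical per-coordinate DP noise scale

raw = noisy_second_moment(true_grad, noise_std, n_samples=100_000)
corrected = bias_corrected(raw, noise_std)
print(raw, corrected)
```

With these toy numbers the raw estimate is dominated by the noise variance rather than by the squared gradient, so every coordinate would be scaled by nearly the same denominator. That is consistent with the abstract's point that uncorrected DP-Adam loses Adam's per-parameter scaling.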
