Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"Text Classification": models, code, and papers

Multinomial Adversarial Networks for Multi-Domain Text Classification

Feb 15, 2018
Xilun Chen, Claire Cardie

Many text classification tasks are known to be highly domain-dependent. Unfortunately, the availability of training data can vary drastically across domains. Worse still, for some domains there may not be any annotated data at all. In this work, we propose a multinomial adversarial network (MAN) to tackle the text classification problem in this real-world multidomain setting (MDTC). We provide theoretical justifications for the MAN framework, proving that different instances of MANs are essentially minimizers of various f-divergence metrics (Ali and Silvey, 1966) among multiple probability distributions. MANs are thus a theoretically sound generalization of traditional adversarial networks that discriminate over two distributions. More specifically, for the MDTC task, MAN learns features that are invariant across multiple domains by resorting to its ability to reduce the divergence among the feature distributions of each domain. We present experimental results showing that MANs significantly outperform the prior art on the MDTC task. We also show that MANs achieve state-of-the-art performance for domains with no labeled data.

* NAACL 2018 
Access Paper or Ask Questions

Text-Attentional Convolutional Neural Networks for Scene Text Detection

Mar 24, 2016
Tong He, Weilin Huang, Yu Qiao, Jian Yao

Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature computed globally from a whole image component (patch), where the cluttered background information may dominate true text features in the deep representation. This leads to less discriminative power and poorer robustness. In this work, we present a new system for scene text detection by proposing a novel Text-Attentional Convolutional Neural Network (Text-CNN) that particularly focuses on extracting text-related regions and features from the image components. We develop a new learning mechanism to train the Text-CNN with multi-level and rich supervised information, including text region mask, character label, and binary text/nontext information. The rich supervision information enables the Text-CNN with a strong capability for discriminating ambiguous texts, and also increases its robustness against complicated background components. The training process is formulated as a multi-task learning problem, where low-level supervised information greatly facilitates main task of text/non-text classification. In addition, a powerful low-level detector called Contrast- Enhancement Maximally Stable Extremal Regions (CE-MSERs) is developed, which extends the widely-used MSERs by enhancing intensity contrast between text patterns and background. This allows it to detect highly challenging text patterns, resulting in a higher recall. Our approach achieved promising results on the ICDAR 2013 dataset, with a F-measure of 0.82, improving the state-of-the-art results substantially.

* To appear in IEEE Trans. on Image Processing, 2016 
Access Paper or Ask Questions

RAFT: A Real-World Few-Shot Text Classification Benchmark

Sep 28, 2021
Neel Alex, Eli Lifland, Lewis Tunstall, Abhishek Thakur, Pegah Maham, C. Jess Riedel, Emmie Hine, Carolyn Ashurst, Paul Sedille, Alexis Carlier, Michael Noetel, Andreas Stuhlmüller

Large pre-trained language models have shown promise for few-shot learning, completing text-based tasks given only a few task-specific examples. Will models soon solve classification tasks that have so far been reserved for human research assistants? Existing benchmarks are not designed to measure progress in applied settings, and so don't directly answer this question. The RAFT benchmark (Real-world Annotated Few-shot Tasks) focuses on naturally occurring tasks and uses an evaluation setup that mirrors deployment. Baseline evaluations on RAFT reveal areas current techniques struggle with: reasoning over long texts and tasks with many classes. Human baselines show that some classification tasks are difficult for non-expert humans, reflecting that real-world value sometimes depends on domain expertise. Yet even non-expert human baseline F1 scores exceed GPT-3 by an average of 0.11. The RAFT datasets and leaderboard will track which model improvements translate into real-world benefits at .

* Dataset, submission instructions, code and leaderboard available at 
Access Paper or Ask Questions

Data Mining of Causal Relations from Text: Analysing Maritime Accident Investigation Reports

Jul 09, 2015
Santosh Tirunagari

Text mining is a process of extracting information of interest from text. Such a method includes techniques from various areas such as Information Retrieval (IR), Natural Language Processing (NLP), and Information Extraction (IE). In this study, text mining methods are applied to extract causal relations from maritime accident investigation reports collected from the Marine Accident Investigation Branch (MAIB). These causal relations provide information on various mechanisms behind accidents, including human and organizational factors relating to the accident. The objective of this study is to facilitate the analysis of the maritime accident investigation reports, by means of extracting contributory causes with more feasibility. A careful investigation of contributory causes from the reports provide opportunity to improve safety in future. Two methods have been employed in this study to extract the causal relations. They are 1) Pattern classification method and 2) Connectives method. The earlier one uses naive Bayes and Support Vector Machines (SVM) as classifiers. The latter simply searches for the words connecting cause and effect in sentences. The causal patterns extracted using these two methods are compared to the manual (human expert) extraction. The pattern classification method showed a fair and sensible performance with F-measure(average) = 65% when compared to connectives method with F-measure(average) = 58%. This study is an evidence, that text mining methods could be employed in extracting causal relations from marine accident investigation reports.

Access Paper or Ask Questions

Short-Text Classification Using Unsupervised Keyword Expansion

Sep 16, 2019
Duncan Cameron-Steinke

Short-text classification, like all data science, struggles to achieve high performance using limited data. As a solution, a short sentence may be expanded with new and relevant feature words to form an artificially enlarged dataset, and add new features to testing data. This paper applies a novel approach to text expansion by generating new words directly for each input sentence, thus requiring no additional datasets or previous training. In this unsupervised approach, new keywords are formed within the hidden states of a pre-trained language model and then used to create extended pseudo documents. The word generation process was assessed by examining how well the predicted words matched to topics of the input sentence. It was found that this method could produce 3-10 relevant new words for each target topic, while generating just 1 word related to each non-target topic. Generated words were then added to short news headlines to create extended pseudo headlines. Experimental results have shown that models trained using the pseudo headlines can improve classification accuracy when limiting the number of training examples.

Access Paper or Ask Questions

Effectiveness of Self Normalizing Neural Networks for Text Classification

May 03, 2019
Avinash Madasu, Vijjini Anvesh Rao

Self Normalizing Neural Networks(SNN) proposed on Feed Forward Neural Networks(FNN) outperform regular FNN architectures in various machine learning tasks. Particularly in the domain of Computer Vision, the activation function Scaled Exponential Linear Units (SELU) proposed for SNNs, perform better than other non linear activations such as ReLU. The goal of SNN is to produce a normalized output for a normalized input. Established neural network architectures like feed forward networks and Convolutional Neural Networks(CNN) lack the intrinsic nature of normalizing outputs. Hence, requiring additional layers such as Batch Normalization. Despite the success of SNNs, their characteristic features on other network architectures like CNN haven't been explored, especially in the domain of Natural Language Processing. In this paper we aim to show the effectiveness of proposed, Self Normalizing Convolutional Neural Networks(SCNN) on text classification. We analyze their performance with the standard CNN architecture used on several text classification datasets. Our experiments demonstrate that SCNN achieves comparable results to standard CNN model with significantly fewer parameters. Furthermore it also outperforms CNN with equal number of parameters.

* Accepted Long Paper at 20th International Conference on Computational Linguistics and Intelligent Text Processing, April 2019, La Rochelle, France 
Access Paper or Ask Questions

Meta-Learning Adversarial Domain Adaptation Network for Few-Shot Text Classification

Jul 26, 2021
ChengCheng Han, Zeqiu Fan, Dongxiang Zhang, Minghui Qiu, Ming Gao, Aoying Zhou

Meta-learning has emerged as a trending technique to tackle few-shot text classification and achieved state-of-the-art performance. However, existing solutions heavily rely on the exploitation of lexical features and their distributional signatures on training data, while neglecting to strengthen the model's ability to adapt to new tasks. In this paper, we propose a novel meta-learning framework integrated with an adversarial domain adaptation network, aiming to improve the adaptive ability of the model and generate high-quality text embedding for new classes. Extensive experiments are conducted on four benchmark datasets and our method demonstrates clear superiority over the state-of-the-art models in all the datasets. In particular, the accuracy of 1-shot and 5-shot classification on the dataset of 20 Newsgroups is boosted from 52.1% to 59.6%, and from 68.3% to 77.8%, respectively.

Access Paper or Ask Questions

Privacy enabled Financial Text Classification using Differential Privacy and Federated Learning

Oct 04, 2021
Priyam Basu, Tiasa Singha Roy, Rakshit Naidu, Zumrut Muftuoglu

Privacy is important considering the financial Domain as such data is highly confidential and sensitive. Natural Language Processing (NLP) techniques can be applied for text classification and entity detection purposes in financial domains such as customer feedback sentiment analysis, invoice entity detection, categorisation of financial documents by type etc. Due to the sensitive nature of such data, privacy measures need to be taken for handling and training large models with such data. In this work, we propose a contextualized transformer (BERT and RoBERTa) based text classification model integrated with privacy features such as Differential Privacy (DP) and Federated Learning (FL). We present how to privately train NLP models and desirable privacy-utility tradeoffs and evaluate them on the Financial Phrase Bank dataset.

* 4 pages. Accepted at ECONLP-EMNLP'21 
Access Paper or Ask Questions

Topics to Avoid: Demoting Latent Confounds in Text Classification

Sep 01, 2019
Sachin Kumar, Shuly Wintner, Noah A. Smith, Yulia Tsvetkov

Despite impressive performance on many text classification tasks, deep neural networks tend to learn frequent superficial patterns that are specific to the training data and do not always generalize well. In this work, we observe this limitation with respect to the task of native language identification. We find that standard text classifiers which perform well on the test set end up learning topical features which are confounds of the prediction task (e.g., if the input text mentions Sweden, the classifier predicts that the author's native language is Swedish). We propose a method that represents the latent topical confounds and a model which "unlearns" confounding features by predicting both the label of the input text and the confound; but we train the two predictors adversarially in an alternating fashion to learn a text representation that predicts the correct label but is less prone to using information about the confound. We show that this model generalizes better and learns features that are indicative of the writing style rather than the content.

* 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP 2019) 
Access Paper or Ask Questions

Memory Efficient Continual Learning for Neural Text Classification

Mar 09, 2022
Beyza Ermis, Giovanni Zappella, Martin Wistuba, Cedric Archambeau

Learning text classifiers based on pre-trained language models has become the standard practice in natural language processing applications. Unfortunately, training large neural language models, such as transformers, from scratch is very costly and requires a vast amount of training data, which might not be available in the application domain of interest. Moreover, in many real-world scenarios, classes are uncovered as more data is seen, calling for class-incremental modelling approaches. In this work we devise a method to perform text classification using pre-trained models on a sequence of classification tasks provided in sequence. We formalize the problem as a continual learning problem where the algorithm learns new tasks without performance degradation on the previous ones and without re-training the model from scratch. We empirically demonstrate that our method requires significantly less model parameters compared to other state of the art methods and that it is significantly faster at inference time. The tight control on the number of model parameters, and so the memory, is not only improving efficiency. It is making possible the usage of the algorithm in real-world applications where deploying a solution with a constantly increasing memory consumption is just unrealistic. While our method suffers little forgetting, it retains a predictive performance on-par with state of the art but less memory efficient methods.

Access Paper or Ask Questions