Frederico Souza

BERT for Sentiment Analysis: Pre-trained and Fine-Tuned Alternatives

Jan 10, 2022

Frederico Souza, João Filho

Abstract: BERT has revolutionized the NLP field by enabling transfer learning with large language models that can capture complex textual patterns, reaching the state of the art for a significant number of NLP applications. For text classification tasks, BERT has already been extensively explored. However, aspects such as how best to handle the different embeddings provided by the BERT output layer, and the use of language-specific instead of multilingual models, are not well studied in the literature, especially for Brazilian Portuguese. This article conducts an extensive experimental study of different strategies for aggregating the features produced in the BERT output layer, with a focus on the sentiment analysis task. The experiments include BERT models trained on Brazilian Portuguese corpora as well as the multilingual version, and cover multiple aggregation strategies and open-source datasets with predefined training, validation, and test partitions to facilitate reproducibility. BERT achieved higher ROC-AUC values than TF-IDF in the majority of cases. Nonetheless, TF-IDF represents a good trade-off between predictive performance and computational cost.
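
As a rough illustration of what "aggregating the features produced in the BERT output layer" can mean, the sketch below shows three commonly used pooling strategies over a matrix of per-token vectors. The abstract does not say which strategies the paper evaluates, so the choice of [CLS], mean, and max pooling, the function names, and the toy numbers are all assumptions made for illustration.

```python
from typing import List

# Assumed layout: one row per token position, one column per hidden dimension.
Matrix = List[List[float]]


def cls_pooling(hidden: Matrix) -> List[float]:
    """Use the vector of the first token ([CLS]) as the document embedding."""
    return list(hidden[0])


def mean_pooling(hidden: Matrix, mask: List[int]) -> List[float]:
    """Average the token vectors, ignoring padding positions (mask == 0)."""
    kept = [row for row, m in zip(hidden, mask) if m]
    return [sum(col) / len(kept) for col in zip(*kept)]


def max_pooling(hidden: Matrix, mask: List[int]) -> List[float]:
    """Take the per-dimension maximum over non-padding token vectors."""
    kept = [row for row, m in zip(hidden, mask) if m]
    return [max(col) for col in zip(*kept)]


# Toy output layer (hypothetical values): 4 positions, hidden size 3.
# The last position is padding and must not influence mean or max pooling.
hidden_states = [
    [0.5, -1.0, 2.0],
    [1.5, 0.0, 0.0],
    [1.0, 4.0, -2.0],
    [9.0, 9.0, 9.0],
]
attention_mask = [1, 1, 1, 0]

cls_vec = cls_pooling(hidden_states)
mean_vec = mean_pooling(hidden_states, attention_mask)
max_vec = max_pooling(hidden_states, attention_mask)
```

Each function returns a single fixed-length vector per document, which a downstream classifier could consume in the same way it would consume a TF-IDF vector.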

* 10 pages, 1 figure, 3 tables. Accepted at International Conference on the Computational Processing of Portuguese (PROPOR 2022), but not yet published

Sentiment Analysis on Brazilian Portuguese User Reviews

Dec 10, 2021

Frederico Souza, João Filho

Abstract: Sentiment analysis is one of the most classical and widely studied natural language processing tasks. The problem has advanced notably with the introduction of more complex and scalable machine learning models. Despite this progress, Brazilian Portuguese still has only limited linguistic resources, such as datasets dedicated to sentiment classification, especially ones with predefined training, testing, and validation partitions that would allow a fairer comparison of different algorithms. Motivated by these issues, this work analyzes the predictive performance of a range of document embedding strategies, taking polarity as the system outcome. The analysis covers five sentiment analysis datasets in Brazilian Portuguese, unified into a single dataset, together with a reference partitioning into training, testing, and validation sets, both made publicly available through a digital repository. A cross-evaluation of dataset-specific models over different contexts is conducted to assess their generalization capabilities and the feasibility of adopting a single model for all scenarios.
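
The cross-evaluation described above can be pictured as a grid: one model is trained per dataset and then scored on the test partition of every dataset. The sketch below is a minimal illustration under stated assumptions only. The word-vote classifier, the accuracy metric, the dataset names, and the toy reviews are hypothetical stand-ins, not the embedding strategies, metrics, or data used in the paper.

```python
from typing import Dict, List, Tuple

Example = Tuple[str, int]          # (review text, polarity: 1 = positive, 0 = negative)
Split = Dict[str, List[Example]]   # assumed keys: "train" and "test"


def fit_word_votes(train: List[Example]) -> Dict[str, int]:
    """Toy stand-in model: each word gets +1 per positive and -1 per negative review."""
    votes: Dict[str, int] = {}
    for text, label in train:
        for word in text.lower().split():
            votes[word] = votes.get(word, 0) + (1 if label == 1 else -1)
    return votes


def predict(votes: Dict[str, int], text: str) -> int:
    """Predict positive only when the summed word votes are strictly positive."""
    score = sum(votes.get(word, 0) for word in text.lower().split())
    return 1 if score > 0 else 0


def cross_evaluate(datasets: Dict[str, Split]) -> Dict[Tuple[str, str], float]:
    """Train one model per dataset and report its accuracy on every test set."""
    results: Dict[Tuple[str, str], float] = {}
    for train_name, train_split in datasets.items():
        model = fit_word_votes(train_split["train"])
        for test_name, test_split in datasets.items():
            test = test_split["test"]
            hits = sum(predict(model, text) == label for text, label in test)
            results[(train_name, test_name)] = hits / len(test)
    return results


# Hypothetical two-domain toy data with predefined train/test partitions.
datasets = {
    "hotels": {
        "train": [("quarto excelente", 1), ("quarto ruim", 0)],
        "test": [("hotel excelente", 1), ("servico ruim", 0)],
    },
    "movies": {
        "train": [("filme bom", 1), ("filme chato", 0)],
        "test": [("enredo chato", 0), ("elenco bom", 1)],
    },
}

grid = cross_evaluate(datasets)
```

In this toy setup the diagonal entries (same domain for training and testing) score higher than the off-diagonal ones, which is the kind of gap a cross-evaluation is meant to expose when judging whether a single model can serve all scenarios.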

* 6 pages, 2 figures, 6 tables. Accepted and presented at IEEE Latin American Conference on Computational Intelligence (LA-CCI 2021), but not yet published
