Yuefeng Shi

Text Analytics Evaluation Framework: A Case Study on LLMs and Social Media

May 20, 2026

Yuefeng Shi, Nedjma Ousidhoum, Jose Camacho-Collados

Abstract: LLMs have demonstrated exceptional proficiency in a wide range of NLP tasks. However, a notable gap remains in practical data analysis scenarios, particularly when LLMs are required to process long sequences of unstructured documents, such as news feeds or, as specifically addressed in this paper, social media posts. To empirically assess the effectiveness of LLMs in this setting, we introduce a question-based evaluation framework comprising 470 manually curated questions designed to evaluate LLMs' semantic understanding and reasoning abilities over aggregated text data. We apply our benchmark to diverse Twitter datasets covering various NLP tasks, including sentiment analysis, hate speech detection, and emotion recognition. Our results reveal that performance depends heavily on input scale and the complexity of the data sources, declining noticeably in multi-label or target-dependent scenarios. In addition, as task complexity increases, performance drops progressively from basic semantic existence identification to more demanding operations such as comparison, counting, and calculation. Furthermore, as the input size grows beyond 500 instances, we identify a common limitation across LLMs, particularly open-weight models: performance degrades substantially, especially on numerical tasks. These findings highlight critical architectural bottlenecks in current LLMs for performing rigorous quantitative analysis over large text collections.

Exploiting Sentiment and Common Sense for Zero-shot Stance Detection

Aug 18, 2022

Yun Luo, Zihan Liu, Yuefeng Shi, Yue Zhang

Abstract: The stance detection task aims to classify the stance toward given documents and topics. Since the topics can be implicit in documents and unseen in training data for zero-shot settings, we propose to boost the transferability of the stance detection model by using sentiment and commonsense knowledge, which are seldom considered in previous studies. Our model includes a graph autoencoder module to obtain commonsense knowledge and a stance detection module with sentiment and commonsense. Experimental results show that our model outperforms the state-of-the-art methods on the zero-shot and few-shot benchmark dataset VAST. Meanwhile, ablation studies prove the significance of each module in our model. Analysis of the relations between sentiment, common sense, and stance indicates the effectiveness of sentiment and common sense.

A Pilot Study for Chinese SQL Semantic Parsing

Oct 16, 2019

Qingkai Min, Yuefeng Shi, Yue Zhang

Abstract: The task of semantic parsing is highly useful for dialogue and question answering systems. Many datasets have been proposed to map natural language text into SQL, among which the recent Spider dataset provides cross-domain samples with multiple tables and complex queries. We build a Spider dataset for Chinese, which is currently a low-resource language in this task area. Interesting research questions arise from the uniqueness of the language, which requires word segmentation, and also from the fact that SQL keywords and columns of DB tables are typically written in English. We compare character- and word-based encoders for a semantic parser, and different embedding schemes. Results show that a word-based semantic parser is subject to segmentation errors and that cross-lingual word embeddings are useful for text-to-SQL.

* EMNLP 2019
