Renee Sieber

WXImpactBench: A Disruptive Weather Impact Understanding Benchmark for Evaluating Large Language Models

May 26, 2025

Yongan Yu, Qingchen Hu, Xianda Du, Jiayin Wang, Fengran Mo, Renee Sieber

Abstract: Climate change adaptation requires the understanding of disruptive weather impacts on society, where large language models (LLMs) might be applicable. However, their effectiveness is under-explored due to the difficulty of high-quality corpus collection and the lack of available benchmarks. The climate-related events stored in regional newspapers record how communities adapted to and recovered from disasters. However, the processing of the original corpus is non-trivial. In this study, we first develop a disruptive weather impact dataset with a well-crafted four-stage construction pipeline. Then, we propose WXImpactBench, the first benchmark for evaluating the capacity of LLMs on disruptive weather impacts. The benchmark involves two evaluation tasks, multi-label classification and ranking-based question answering. Extensive experiments on evaluating a set of LLMs provide first-hand analysis of the challenges in developing disruptive weather impact understanding and climate change adaptation systems. The constructed dataset and the code for the evaluation framework are available to help society protect against vulnerabilities from disasters.

* Accepted by ACL 2025

Bridging the gap between supervised classification and unsupervised topic modelling for social-media assisted crisis management

Mar 22, 2021

Mikael Brunila, Rosie Zhao, Andrei Mircea, Sam Lumley, Renee Sieber

Abstract: Social media such as Twitter provide valuable information to crisis managers and affected people during natural disasters. Machine learning can help structure and extract information from the large volume of messages shared during a crisis; however, the constantly evolving nature of crises makes effective domain adaptation essential. Supervised classification is limited by unchangeable class labels that may not be relevant to new events, and unsupervised topic modelling by insufficient prior knowledge. In this paper, we bridge the gap between the two and show that BERT embeddings finetuned on crisis-related tweet classification can effectively be used to adapt to a new crisis, discovering novel topics while preserving relevant classes from supervised training, and leveraging bidirectional self-attention to extract topic keywords. We create a dataset of tweets from a snowstorm to evaluate our method's transferability to new crises, and find that it outperforms traditional topic models in both automatic and human evaluations grounded in the needs of crisis managers. More broadly, our method can be used for textual domain adaptation where the latent classes are unknown but overlap with known classes from other domains.

* Adapt-NLP @EACL2021; first three authors contributed equally; code available at https://github.com/smacawi/bert-topics/
