Zihan Mei

GroupToM-Bench: Benchmarking Group Theory of Mind and Nonlinear Social Emergence in MLLMs

Jun 02, 2026

Weidong Tang, Jierui Li, Yueling Hou, Zihan Mei, Can Zhang, Xinyan Wan, Zhiyuan Liang, Pengfei Zhou, Yang You, Wangbo Zhao

Abstract: True general intelligence requires not only a model of the physical world but also a social world model: the capacity to infer how individual mental states interact and crystallize into group-level outcomes. Despite notable progress in individual-level Theory of Mind (ToM) reasoning, existing multimodal large language models fail at this broader task. Collective behavior emerges non-linearly from social tensions, conformity dynamics, and structural constraints, meaning it cannot be recovered by merely summing individual intentions. We present GroupToM-Bench, the first multimodal benchmark for group-level ToM, built around a causal chain spanning micro-level BDI states (belief, desire, intention), meso-level group tension and structural constraints, and macro-level outcome prediction and mechanistic attribution. To probe this full arc, we develop a seven-level cognitive audit framework. Experiments reveal a gap between current models and human baselines, highlighting a failure to process social structures and non-linear collective dynamics.

* Accepted by ACL 2026

Multi-Attribute Attention Network for Interpretable Diagnosis of Thyroid Nodules in Ultrasound Images

Jul 09, 2022

Van T. Manh, Jianqiao Zhou, Xiaohong Jia, Zehui Lin, Wenwen Xu, Zihan Mei, Yijie Dong, Xin Yang, Ruobing Huang, Dong Ni

Abstract: Ultrasound (US) is the primary imaging technique for the diagnosis of thyroid cancer. However, accurate identification of nodule malignancy is a challenging task that can elude less-experienced clinicians. Recently, many computer-aided diagnosis (CAD) systems have been proposed to assist this process. However, most of them do not explain the reasoning behind their classifications, which may jeopardize their credibility in practical use. To overcome this, we propose a novel deep learning framework called multi-attribute attention network (MAA-Net) that is designed to mimic the clinical diagnosis process. The proposed model learns to predict nodular attributes and infers malignancy from these clinically relevant features. A multi-attention scheme is adopted to generate customized attention that improves each attribute task and the malignancy diagnosis. Furthermore, MAA-Net uses nodule delineations as spatial prior guidance during training, rather than cropping the nodules with additional models or human intervention, to avoid losing context information. Validation experiments were performed on a large and challenging dataset containing 4554 patients. Results show that the proposed method outperformed other state-of-the-art methods and provides interpretable predictions that may better suit clinical needs.

* 11 pages, 7 figures
