Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shengyu Guo

MirrorBench: Evaluating Self-centric Intelligence in MLLMs by Introducing a Mirror

Apr 16, 2026

Shengyu Guo, Tongrui Ye, Jianbo Zhang, Zicheng Zhang, Chunyi Li, Guangtao Zhai

Abstract:Recent progress in Multimodal Large Language Models (MLLMs) has demonstrated remarkable advances in perception and reasoning, suggesting their potential for embodied intelligence. While recent studies have evaluated embodied MLLMs in interactive settings, current benchmarks mainly target capabilities to perceive, understand, and interact with external objects, lacking a systematic evaluation of self-centric intelligence. To address this, we introduce MirrorBench, a simulation-based benchmark inspired by the classical Mirror Self-Recognition (MSR) test in psychology. MirrorBench extends this paradigm to embodied MLLMs through a tiered framework of progressively challenging tasks, assessing agents from basic visual perception to high-level self-representation. Experiments on leading MLLMs show that even at the lowest level, their performance remains substantially inferior to human performance, revealing fundamental limitations in self-referential understanding. Our study bridges psychological paradigms and embodied intelligence, offering a principled framework for evaluating the emergence of general intelligence in large models. Project page: https://fflahm.github.io/mirror-bench-page/.

Via

Access Paper or Ask Questions

Improving Topic Relevance Model by Mix-structured Summarization and LLM-based Data Augmentation

Apr 03, 2024

Yizhu Liu, Ran Tao, Shengyu Guo, Yifan Yang

Figure 1 for Improving Topic Relevance Model by Mix-structured Summarization and LLM-based Data Augmentation

Figure 2 for Improving Topic Relevance Model by Mix-structured Summarization and LLM-based Data Augmentation

Figure 3 for Improving Topic Relevance Model by Mix-structured Summarization and LLM-based Data Augmentation

Figure 4 for Improving Topic Relevance Model by Mix-structured Summarization and LLM-based Data Augmentation

Abstract:Topic relevance between query and document is a very important part of social search, which can evaluate the degree of matching between document and user's requirement. In most social search scenarios such as Dianping, modeling search relevance always faces two challenges. One is that many documents in social search are very long and have much redundant information. The other is that the training data for search relevance model is difficult to get, especially for multi-classification relevance model. To tackle above two problems, we first take query concatenated with the query-based summary and the document summary without query as the input of topic relevance model, which can help model learn the relevance degree between query and the core topic of document. Then, we utilize the language understanding and generation abilities of large language model (LLM) to rewrite and generate query from queries and documents in existing training data, which can construct new query-document pairs as training data. Extensive offline experiments and online A/B tests show that the proposed approaches effectively improve the performance of relevance modeling.

Via

Access Paper or Ask Questions