Yongchang Li

CollabVLA: Self-Reflective Vision-Language-Action Model Dreaming Together with Human

Sep 18, 2025

Nan Sun, Yongchang Li, Chenxu Wang, Huiying Li, Huaping Liu

Abstract: In this work, we present CollabVLA, a self-reflective vision-language-action framework that transforms a standard visuomotor policy into a collaborative assistant. CollabVLA tackles key limitations of prior VLAs, including domain overfitting, non-interpretable reasoning, and the high latency of auxiliary generative models, by integrating VLM-based reflective reasoning with diffusion-based action generation under a mixture-of-experts design. Through a two-stage training recipe of action grounding and reflection tuning, it supports explicit self-reflection and proactively solicits human guidance when confronted with uncertainty or repeated failure. It cuts normalized Time by ~2x and Dream counts by ~4x vs. generative agents, achieving higher success rates, improved interpretability, and balanced low latency compared with existing methods. This work takes a pioneering step toward shifting VLAs from opaque controllers to genuinely assistive agents capable of reasoning, acting, and collaborating with humans.

* 8 pages, 5 figures, 3 tables

Knowledge Editing for Large Language Model with Knowledge Neuronal Ensemble

Dec 30, 2024

Yongchang Li, Yujin Zhu, Tao Yan, Shijian Fan, Gang Wu, Liang Xu

Abstract: As real-world knowledge is constantly evolving, ensuring the timeliness and accuracy of a model's knowledge is crucial. This has made knowledge editing in large language models increasingly important. However, existing knowledge editing methods face several challenges, including parameter localization coupling, imprecise localization, and a lack of dynamic interaction across layers. In this paper, we propose a novel knowledge editing method called Knowledge Neuronal Ensemble (KNE). A knowledge neuronal ensemble represents a group of neurons encoding specific knowledge, thus mitigating the issue of frequent parameter modification caused by coupling in parameter localization. The KNE method enhances the precision and accuracy of parameter localization by computing gradient attribution scores for each parameter at each layer. During the editing process, only the gradients and losses associated with the knowledge neuronal ensemble are computed, with error backpropagation performed accordingly, ensuring dynamic interaction and collaborative updates among parameters. Experimental results on three widely used knowledge editing datasets show that the KNE method significantly improves the accuracy of knowledge editing and matches, or even exceeds, the performance of the best baseline methods on portability and locality metrics.

* 26 pages, 5 figures, 2 tables
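
The KNE abstract above describes two steps: scoring parameters by gradient attribution, then updating only the selected group. The following is a toy sketch of that general idea on a four-weight linear model, not the authors' implementation. The scoring rule (absolute value of weight times gradient), the function names, and all numbers are assumptions chosen for illustration.

```python
def grad(weights, x, target):
    """Gradient of the squared error 0.5 * (w.x - target)^2 for each weight."""
    err = sum(w * xi for w, xi in zip(weights, x)) - target
    return [err * xi for xi in x]


def select_ensemble(weights, x, target, k):
    """Pick the k weights with the largest |weight * gradient| score.

    This scoring rule is an assumed stand-in for the paper's attribution score.
    """
    g = grad(weights, x, target)
    scores = [abs(w * gi) for w, gi in zip(weights, g)]
    ranked = sorted(range(len(weights)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def edit(weights, x, target, ensemble, lr=0.1, steps=200):
    """Gradient descent that updates only the selected weights; the rest stay frozen."""
    w = list(weights)
    for _ in range(steps):
        g = grad(w, x, target)
        for i in ensemble:
            w[i] -= lr * g[i]
    return w


# Hypothetical toy data: one input, one new target output to "edit in".
weights = [0.5, -1.0, 2.0, 0.1]
x = [1.0, 0.2, 1.0, 2.0]
target = 4.0

ensemble = select_ensemble(weights, x, target, k=2)
edited = edit(weights, x, target, ensemble)
prediction = sum(w * xi for w, xi in zip(edited, x))
```

In this sketch the weights outside the selected group are never touched, which is the property the abstract associates with limiting unintended parameter changes; how the real method groups neurons across layers is not specified here.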

AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment

Sep 26, 2024

Nan Sun, Bo Mao, Yongchang Li, Lumeng Ma, Di Guo, Huaping Liu

Abstract: The increasing demand for intelligent assistants in human-populated environments has motivated significant research in autonomous robotic systems. Traditional service robots and virtual assistants, however, struggle with real-world task execution due to their limited capacity for dynamic reasoning and interaction, particularly when human collaboration is required. Recent developments in Large Language Models have opened new avenues for improving these systems, enabling more sophisticated reasoning and natural interaction capabilities. In this paper, we introduce AssistantX, an LLM-powered proactive assistant designed to operate autonomously in a physical office environment. Unlike conventional service robots, AssistantX leverages a novel multi-agent architecture, PPDR4X, which provides advanced inference capabilities and comprehensive collaboration awareness. By effectively bridging the gap between virtual operations and physical interactions, AssistantX demonstrates robust performance in managing complex real-world scenarios. Our evaluation highlights the architecture's effectiveness, showing that AssistantX can respond to clear instructions, actively retrieve supplementary information from memory, and proactively seek collaboration from team members to ensure successful task completion. More details and videos can be found at https://assistantx-agent.github.io/AssistantX/.

* 6 pages, 8 figures, 4 tables
