Abstract: Achieving robot transparency is a critical step toward effective human-robot collaboration. To be transparent, a robot's natural language communication must be consistent with its actions and explicitly grounded in the task and environment. Existing hierarchical Vision-Language-Action (VLA) models can generate both language (e.g., through chain-of-thought) and low-level actions; however, current work does not enforce explicit alignment between these modalities during training. To address this gap, we propose a novel training framework that explicitly grounds hierarchical VLA sub-task descriptions in the visual observation and action space. Our framework uses a contrastive model to assess the alignment between generated language and the corresponding action trajectories. This contrastive model directly ranks language-trajectory pairs by their alignment, allowing us to refine the grounding of our hierarchical VLA through offline preference learning. We apply our framework to LanguageTable, a benchmark dataset of human language-annotated trajectories, provide critical insights into multimodal grounding representations, and establish a strong baseline that achieves performance comparable to fully supervised fine-tuning while minimizing the need for costly data annotations.
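The core scoring-and-ranking idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy embeddings, the cosine-similarity scorer, the `temperature` value, and the `preference_pairs` helper are all placeholder assumptions standing in for learned language/trajectory encoders and the actual contrastive model.

```python
import numpy as np

def alignment_scores(lang_emb, traj_emb, temperature=0.1):
    """Scaled cosine similarities between language and trajectory embeddings.

    Rows of the result score one language description against every trajectory.
    """
    lang = lang_emb / np.linalg.norm(lang_emb, axis=1, keepdims=True)
    traj = traj_emb / np.linalg.norm(traj_emb, axis=1, keepdims=True)
    return (lang @ traj.T) / temperature

def preference_pairs(scores):
    """For each language description, take the best- and worst-aligned
    trajectory as a (preferred, rejected) pair for offline preference learning."""
    return [(int(np.argmax(row)), int(np.argmin(row))) for row in scores]

# Toy example: three sub-task descriptions and three trajectories,
# where trajectory i is constructed to match description i.
lang = np.eye(3)
traj = np.array([[0.9, 0.1, 0.0],
                 [0.1, 0.9, 0.1],
                 [0.0, 0.1, 0.9]])

scores = alignment_scores(lang, traj)
pairs = preference_pairs(scores)  # each description prefers its own trajectory
```

In this sketch, the ranked pairs would then feed a standard offline preference-learning objective (e.g., a DPO-style loss) to fine-tune the hierarchical VLA's language head toward better-grounded sub-task descriptions.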



Abstract: A social robot acting as a 'mediator' can enhance interactions between humans in fields such as education and healthcare. A particularly promising research direction is the use of a social robot mediator in multiparty settings, which are the most common in real-world scenarios. However, research on social robot mediation for multiparty interactions is still emerging and faces numerous challenges. This paper provides an overview of social robotics and mediation research, highlighting relevant literature and some of the open problems. It also presents the importance of incorporating relevant psychological principles when developing social robot mediators. Finally, it explores the potential of implementing a Theory of Mind in a social robot mediator, since such a framework could greatly improve mediation by inferring individual and group mental states and interacting accordingly.