Mikita Miyaguchi

Adaptive Policy Switching of Two-Wheeled Differential Robots for Traversing over Diverse Terrains

Mar 05, 2026

Haruki Izawa, Takeshi Takai, Shingo Kitano, Mikita Miyaguchi, Hiroaki Kawashima

Abstract: Exploring lunar lava tubes requires robots to traverse terrain without human intervention. Because pre-trained policies cannot fully cover all possible terrain conditions, our goal is to enable adaptive policy switching, where the robot selects an appropriate terrain-specialized model based on its current terrain features. This study investigates whether terrain types can be estimated effectively using posture-related observations collected during navigation. We fine-tuned a pre-trained policy using Proximal Policy Optimization (PPO), and then collected the robot's 3D orientation data as it moved across flat and rough terrain in a simulated lava-tube environment. Our analysis revealed that the standard deviation of the robot's pitch data differs clearly between these two terrain types. Using Gaussian mixture models (GMMs), we evaluated terrain classification across various window sizes. An accuracy of more than 98% was achieved with a 70-step window. The result suggests that short-term orientation data are sufficient for reliable terrain estimation, providing a foundation for adaptive policy switching.

* Proc. of the Joint Symposium of AROB 31st and ISBC 11th (AROB-ISBC 2026), pp. 787-792, 2026
* Author's version of the paper presented at AROB-ISBC 2026
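
The classification idea in the abstract above (windowed standard deviation of pitch, then a Gaussian mixture model) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the pitch data are synthetic, the noise levels (0.5 and 5.0 degrees) are invented for the example, the windows are assumed to be non-overlapping, and the two-component 1D EM fit is written by hand to keep the sketch dependency-free. Only the 70-step window length comes from the abstract.

```python
import math
import random
import statistics

WINDOW = 70  # window length in steps, taken from the abstract


def windowed_std(series, window):
    """Standard deviation of each non-overlapping window of `series`."""
    return [statistics.pstdev(series[i:i + window])
            for i in range(0, len(series) - window + 1, window)]


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, iters=50):
    """Fit a two-component 1D Gaussian mixture with EM."""
    means = [min(xs), max(xs)]                     # spread the initial means
    var0 = max(statistics.pvariance(xs), 1e-6)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each sample
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300               # guard against underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)          # variance floor
    return weights, means, variances


def classify(x, weights, means, variances):
    """Label a window by its most likely component; larger mean = rough."""
    scores = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
    best = 0 if scores[0] >= scores[1] else 1
    rough = 0 if means[0] > means[1] else 1
    return "rough" if best == rough else "flat"


# Synthetic pitch signal (degrees): 40 flat windows, then 40 rough windows.
# The noise magnitudes are assumptions made only for this illustration.
rng = random.Random(0)
pitch = ([rng.gauss(0.0, 0.5) for _ in range(40 * WINDOW)]
         + [rng.gauss(0.0, 5.0) for _ in range(40 * WINDOW)])
labels = ["flat"] * 40 + ["rough"] * 40

features = windowed_std(pitch, WINDOW)
params = fit_gmm_1d(features)
predictions = [classify(x, *params) for x in features]
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)
print(f"window accuracy on synthetic data: {accuracy:.3f}")
```

In an adaptive policy switching setup, the label returned by `classify` for the most recent window would presumably select which terrain-specialized policy to run next; that selection step is not shown here.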

LLM-Guided Decentralized Exploration with Self-Organizing Robot Teams

Mar 05, 2026

Hiroaki Kawashima, Shun Ikejima, Takeshi Takai, Mikita Miyaguchi, Yasuharu Kunii

Abstract: When individual robots have limited sensing capabilities or insufficient fault tolerance, multiple robots need to form teams during exploration, thereby increasing the collective observation range and reliability. Traditionally, swarm formation has often been managed by a central controller; however, for robustness and flexibility, it is preferable for the swarm to operate autonomously even in the absence of centralized control. In addition, determining the exploration target of each team is crucial for efficient multi-team exploration. This study therefore proposes an exploration method that combines (1) an algorithm for self-organization, enabling the autonomous and dynamic formation of multiple teams, and (2) an algorithm that allows each team to autonomously determine its next exploration target (destination). In particular, whereas classical frontier-based methods and deep reinforcement learning approaches have been widely studied for (2), this study explores a novel strategy based on large language models (LLMs). The effectiveness of the proposed method was validated through simulations involving tens to hundreds of robots.

* Proc. of the Joint Symposium of AROB 31st and ISBC 11th (AROB-ISBC 2026), pp. 923-927, 2026
* Author's version of the paper presented at AROB-ISBC 2026
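
For context, the classical frontier-based target selection that the abstract above contrasts with its LLM-based strategy can be sketched as follows. This is not the method proposed in the paper: it is a minimal illustration of the baseline idea, in which a frontier is a known-free cell adjacent to an unknown cell. The grid encoding, the function names, and the nearest-frontier rule (Manhattan distance) are assumptions made for this example.

```python
FREE, OCCUPIED, UNKNOWN = 0, 1, -1  # hypothetical occupancy-grid encoding


def find_frontiers(grid):
    """Return all free cells that touch at least one unknown cell (4-neighborhood)."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers


def nearest_frontier(grid, position):
    """Pick the frontier cell closest to `position` by Manhattan distance."""
    frontiers = find_frontiers(grid)
    if not frontiers:
        return None  # nothing left to explore
    return min(frontiers,
               key=lambda cell: abs(cell[0] - position[0]) + abs(cell[1] - position[1]))


grid = [
    [FREE, FREE,     UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE,     FREE],
]
team_position = (2, 0)  # (row, column) of a hypothetical team
target = nearest_frontier(grid, team_position)
print(find_frontiers(grid), target)
```

A greedy nearest-frontier rule like this ignores what other teams are doing, which is one reason a multi-team setting needs a more informed way of choosing targets.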
