Ryo Kurazume

Multi-modal panoramic 3D outdoor datasets for place categorization

Apr 14, 2026

Hojung Jung, Yuki Oto, Oscar M. Mozos, Yumi Iwashita, Ryo Kurazume

Abstract:We present two multi-modal panoramic 3D outdoor (MPO) datasets for semantic place categorization with six categories: forest, coast, residential area, urban area, indoor parking lot, and outdoor parking lot. The first dataset consists of 650 static panoramic scans of dense (9,000,000 points) 3D color and reflectance point clouds obtained using a FARO laser scanner with synchronized color images. The second dataset consists of 34,200 real-time panoramic scans of sparse (70,000 points) 3D reflectance point clouds obtained using a Velodyne laser scanner while driving a car. The datasets were obtained in the city of Fukuoka, Japan and are publicly available in [1], [2]. In addition, we compare several approaches for semantic place categorization, with best results of 96.42% (dense) and 89.67% (sparse).

* This is the authors' manuscript. The final published article was presented at IROS 2016, and it is available at https://doi.org/10.1109/IROS.2016.7759669

Incremental Residual Reinforcement Learning Toward Real-World Learning for Social Navigation

Apr 09, 2026

Haruto Nagahisa, Kohei Matsumoto, Yuki Tomita, Yuki Hyodo, Ryo Kurazume

Abstract:As the demand for mobile robots continues to increase, social navigation has emerged as a critical task, driving active research into deep reinforcement learning (RL) approaches. However, because pedestrian dynamics and social conventions vary widely across different regions, simulations cannot easily encompass all possible real-world scenarios. Real-world RL, in which agents learn while operating directly in physical environments, presents a promising solution to this issue. Nevertheless, this approach faces significant challenges, particularly regarding constrained computational resources on edge devices and learning efficiency. In this study, we propose incremental residual RL (IRRL). This method integrates incremental learning, which is a lightweight process that operates without a replay buffer or batch updates, with residual RL, which enhances learning efficiency by training only on the residuals relative to a base policy. Through simulation experiments, we demonstrated that, despite lacking a replay buffer, IRRL achieved performance comparable to that of conventional replay buffer-based methods and outperformed existing incremental learning approaches. Furthermore, real-world experiments confirmed that IRRL enables robots to adapt effectively to previously unseen environments through real-world learning.
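
The two ingredients named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function and class names are hypothetical, and the update rule is plain TD(0) on a linear critic, swapped in only to show what "one transition at a time, no replay buffer" looks like. The residual part is the usual composition of a fixed base command with a learned correction.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def residual_action(
    base_policy: Callable[[Vector], Vector],
    residual_policy: Callable[[Vector], Vector],
    obs: Vector,
    limit: float,
) -> List[float]:
    """Final command = base command + learned correction, clipped to [-limit, limit]."""
    base = base_policy(obs)
    delta = residual_policy(obs)
    return [max(-limit, min(limit, b + d)) for b, d in zip(base, delta)]


class IncrementalLinearCritic:
    """Linear value estimate updated from a single transition (TD(0)).

    Each transition is used once and then discarded, so no replay buffer
    and no mini-batch are needed. This is an illustrative stand-in, not
    the learning rule used in the paper.
    """

    def __init__(self, n_features: int, lr: float = 0.1, gamma: float = 0.9):
        self.w = [0.0] * n_features
        self.lr = lr
        self.gamma = gamma

    def value(self, feats: Vector) -> float:
        return sum(w * f for w, f in zip(self.w, feats))

    def update(self, feats: Vector, reward: float, next_feats: Vector, done: bool) -> float:
        bootstrap = 0.0 if done else self.gamma * self.value(next_feats)
        td_error = reward + bootstrap - self.value(feats)
        self.w = [w + self.lr * td_error * f for w, f in zip(self.w, feats)]
        return td_error
```

In this sketch only the residual and the critic would be trained; the base policy stays fixed, which is what lets learning start from reasonable behaviour.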

DRUM: Diffusion-based Raydrop-aware Unpaired Mapping for Sim2Real LiDAR Segmentation

Mar 27, 2026

Tomoya Miyawaki, Kazuto Nakashima, Yumi Iwashita, Ryo Kurazume

Abstract:LiDAR-based semantic segmentation is a key component for autonomous mobile robots, yet large-scale annotation of LiDAR point clouds is prohibitively expensive and time-consuming. Although simulators can provide labeled synthetic data, models trained on synthetic data often underperform on real-world data due to a data-level domain gap. To address this issue, we propose DRUM, a novel Sim2Real translation framework. We leverage a diffusion model pre-trained on unlabeled real-world data as a generative prior and translate synthetic data by reproducing two key measurement characteristics: reflectance intensity and raydrop noise. To improve sample fidelity, we introduce a raydrop-aware masked guidance mechanism that selectively enforces consistency with the input synthetic data while preserving realistic raydrop noise induced by the diffusion prior. Experimental results demonstrate that DRUM consistently improves Sim2Real performance across multiple representations of LiDAR data. The project page is available at https://miya-tomoya.github.io/drum.

* ICRA 2026

Learning Geometric and Photometric Features from Panoramic LiDAR Scans for Outdoor Place Categorization

Mar 13, 2026

Kazuto Nakashima, Hojung Jung, Yuki Oto, Yumi Iwashita, Ryo Kurazume, Oscar Martinez Mozos

Abstract:Semantic place categorization, which is one of the essential tasks for autonomous robots and vehicles, allows them to have capabilities of self-decision and navigation in unfamiliar environments. In particular, outdoor places are more difficult targets than indoor ones due to perceptual variations, such as dynamic illuminance over twenty-four hours and occlusions by cars and pedestrians. This paper presents a novel method of categorizing outdoor places using convolutional neural networks (CNNs), which take omnidirectional depth/reflectance images obtained by 3D LiDARs as the inputs. First, we construct a large-scale outdoor place dataset named Multi-modal Panoramic 3D Outdoor (MPO) comprising two types of point clouds captured by two different LiDARs. They are labeled with six outdoor place categories: coast, forest, indoor parking, outdoor parking, residential area, and urban area. Second, we provide CNNs for LiDAR-based outdoor place categorization and evaluate our approach with the MPO dataset. Our results on the MPO dataset outperform traditional approaches and show the effectiveness of using both depth and reflectance modalities. To analyze our trained deep networks, we visualize the learned features.

* Advanced Robotics, 32(14):750-765, 2018
* Published in Advanced Robotics on 31 Jul 2018
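
The omnidirectional depth/reflectance images mentioned in this abstract are typically obtained by a spherical projection of the point cloud. The sketch below shows one common way to do this, assuming points given as (x, y, z, reflectance) tuples in the sensor frame; the function name, image layout, and field-of-view parameters are illustrative choices, not details taken from the paper.

```python
import math
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float, float]  # x, y, z, reflectance


def project_to_panorama(
    points: Iterable[Point],
    width: int,
    height: int,
    fov_up_deg: float,
    fov_down_deg: float,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Project a point cloud onto a panoramic grid of depth and reflectance.

    Columns span the full 360 degrees of azimuth, rows span the vertical
    field of view. Empty pixels stay 0.0; when several points fall into
    the same pixel, the nearest one is kept.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    depth = [[0.0] * width for _ in range(height)]
    refl = [[0.0] * width for _ in range(height)]

    for x, y, z, r in points:
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue
        azimuth = math.atan2(y, x)       # in [-pi, pi]
        elevation = math.asin(z / rng)   # in [-pi/2, pi/2]
        if not fov_down <= elevation <= fov_up:
            continue
        col = int(0.5 * (1.0 - azimuth / math.pi) * width) % width
        row = min(height - 1, int((fov_up - elevation) / fov * height))
        if depth[row][col] == 0.0 or rng < depth[row][col]:
            depth[row][col] = rng
            refl[row][col] = r
    return depth, refl
```

The two returned grids can be stacked as input channels for a CNN, which is one way to use both modalities together.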

FusionNet: a frame interpolation network for 4D heart models

Mar 10, 2026

Chujie Chang, Shoko Miyauchi, Ken'ichi Morooka, Ryo Kurazume, Oscar Martinez Mozos

Abstract:Cardiac magnetic resonance (CMR) imaging is widely used to visualise cardiac motion and diagnose heart disease. However, standard CMR imaging requires patients to lie still in a confined space inside a loud machine for 40-60 min, which increases patient discomfort. In addition, shorter scan times reduce the temporal resolution, the spatial resolution, or both, of the captured cardiac motion and thus the diagnostic accuracy of the procedure. Of these, we focus on reduced temporal resolution and propose a neural network called FusionNet to obtain four-dimensional (4D) cardiac motion with high temporal resolution from CMR images captured in a short period of time. The model estimates intermediate 3D heart shapes based on adjacent shapes. In an experimental evaluation, FusionNet achieved a Dice coefficient of over 0.897, confirming that it can recover shapes more precisely than existing methods. The code is available at: https://github.com/smiyauchi199/FusionNet.git

* Medical Image Computing and Computer Assisted Intervention - MICCAI 2023 Workshops. MICCAI 2023. Lecture Notes in Computer Science, vol 14394. Springer, Cham
* This is the authors' version. The final authenticated version is available online at https://doi.org/10.1007/978-3-031-47425-5_4. Published in Medical Image Computing and Computer Assisted Intervention - MICCAI 2023 Workshops
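
For readers unfamiliar with the metric quoted in this abstract, the Dice coefficient between two binary shapes is twice the size of their overlap divided by the sum of their sizes. The sketch below computes it on flattened voxel labels; the function name and the convention of returning 1.0 for two empty shapes are assumptions, and the paper's exact evaluation protocol is not given in the abstract.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |A and B| / (|A| + |B|) for two flattened binary voxel grids."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both shapes are empty: treat them as a perfect match (a convention).
        return 1.0
    return 2.0 * intersection / total
```

A value of 1.0 means the predicted and reference shapes coincide voxel for voxel, and 0.0 means they do not overlap at all.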

LLM-Based Behavior Tree Generation for Construction Machinery

Feb 01, 2026

Akinosuke Tsutsumi, Tomoya Itsuka, Yuichiro Kasahara, Tomoya Kouno, Kota Akinari, Genki Yamauchi, Daisuke Endo, Taro Abe, Takeshi Hashimoto, Keiji Nagatani(+1 more)

Abstract:Earthwork operations are facing an increasing demand, while workforce aging and skill loss create a pressing need for automation. ROS2-TMS for Construction, a Cyber-Physical System framework designed to coordinate construction machinery, has been proposed for autonomous operation; however, its reliance on manually designed Behavior Trees (BTs) limits scalability, particularly in scenarios involving heterogeneous machine cooperation. Recent advances in large language models (LLMs) offer new opportunities for task planning and BT generation. However, most existing approaches remain confined to simulations or simple manipulators, with relatively few applications demonstrated in real-world contexts, such as complex construction sites involving multiple machines. This paper proposes an LLM-based workflow for BT generation, introducing synchronization flags to enable safe and cooperative operation. The workflow consists of two steps: high-level planning, where the LLM generates synchronization flags, and BT generation using structured templates. Safety is ensured by planning with parameters stored in the system database. The proposed method is validated in simulation and further demonstrated through real-world experiments, highlighting its potential to advance automation in civil engineering.

* 7 pages, 7 figures
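
To make the idea of synchronization flags in a behavior tree concrete, here is a small self-contained sketch. It does not use the ROS2-TMS for Construction API or the paper's templates: the node classes, the flag name, and the machine actions are all hypothetical, and the flags are modelled as a shared dictionary purely for illustration.

```python
SUCCESS, FAILURE, RUNNING = "SUCCESS", "FAILURE", "RUNNING"


class Action:
    """Leaf node that records a (hypothetical) machine command and succeeds."""

    def __init__(self, label, log):
        self.label = label
        self.log = log

    def tick(self):
        self.log.append(self.label)
        return SUCCESS


class SetFlag:
    """Leaf node that raises a shared synchronization flag."""

    def __init__(self, flags, name):
        self.flags = flags
        self.name = name

    def tick(self):
        self.flags[self.name] = True
        return SUCCESS


class WaitForFlag:
    """Leaf node that keeps returning RUNNING until the flag has been raised."""

    def __init__(self, flags, name):
        self.flags = flags
        self.name = name

    def tick(self):
        return SUCCESS if self.flags.get(self.name) else RUNNING


class Sequence:
    """Ticks children in order and resumes at the child that was RUNNING."""

    def __init__(self, children):
        self.children = children
        self.index = 0

    def tick(self):
        while self.index < len(self.children):
            status = self.children[self.index].tick()
            if status != SUCCESS:
                return status
            self.index += 1
        return SUCCESS
```

With one tree per machine, a `WaitForFlag` node in the second machine's tree holds it back until the first machine's tree reaches the matching `SetFlag`, so the ordering constraint lives in the trees themselves rather than in external scheduling code.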

COLSON: Controllable Learning-Based Social Navigation via Diffusion-Based Reinforcement Learning

Mar 18, 2025

Yuki Tomita, Kohei Matsumoto, Yuki Hyodo, Ryo Kurazume

Abstract:Mobile robot navigation in dynamic environments with pedestrian traffic is a key challenge in the development of autonomous mobile service robots. Recently, deep reinforcement learning-based methods have been actively studied and have outperformed traditional rule-based approaches owing to their optimization capabilities. Among these, methods that assume a continuous action space typically rely on a Gaussian distribution assumption, which limits the flexibility of generated actions. Meanwhile, the application of diffusion models to reinforcement learning has advanced, allowing for more flexible action distributions compared with Gaussian distribution-based approaches. In this study, we applied a diffusion-based reinforcement learning approach to social navigation and validated its effectiveness. Furthermore, by leveraging the characteristics of diffusion models, we propose an extension that enables post-training action smoothing and adaptation to static obstacle scenarios not considered during the training steps.

* This work has been submitted to IROS 2025 for possible publication

Fast LiDAR Data Generation with Rectified Flows

Dec 03, 2024

Kazuto Nakashima, Xiaowen Liu, Tomoya Miyawaki, Yumi Iwashita, Ryo Kurazume

Abstract:Building LiDAR generative models holds promise as powerful data priors for restoration, scene manipulation, and scalable simulation in autonomous mobile robots. In recent years, approaches using diffusion models have emerged, significantly improving training stability and generation quality. Despite the success of diffusion models, generating high-quality samples requires numerous iterations of running neural networks, and the increasing computational cost can pose a barrier to robotics applications. To address this challenge, this paper presents R2Flow, a fast and high-fidelity generative model for LiDAR data. Our method is based on rectified flows that learn straight trajectories, simulating data generation with far fewer sampling steps than diffusion models. We also propose an efficient Transformer-based model architecture for processing the image representation of LiDAR range and reflectance measurements. Our experiments on the unconditional generation of the KITTI-360 dataset demonstrate the effectiveness of our approach in terms of both efficiency and quality.
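
Why straight trajectories allow few sampling steps can be seen in a toy sketch. Sampling from a flow model means integrating dx/dt = v(x, t) from t = 0 to t = 1, here with a plain Euler scheme. In R2Flow the velocity would come from a trained network; below, a hand-written velocity field with perfectly straight paths toward a fixed target stands in for it, and all names are hypothetical.

```python
from typing import Callable, List, Sequence

Velocity = Callable[[Sequence[float], float], Sequence[float]]


def euler_sample(velocity: Velocity, x0: Sequence[float], num_steps: int) -> List[float]:
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt  # t never reaches 1.0, so the toy field below stays finite
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target: Sequence[float]) -> Velocity:
    """Velocity of the straight path x_t = (1 - t) * x_0 + t * target.

    Along that path the velocity is constant (target - x_0), which can be
    written in terms of the current state as (target - x_t) / (1 - t).
    """

    def velocity(x: Sequence[float], t: float) -> List[float]:
        return [(c - xi) / (1.0 - t) for c, xi in zip(target, x)]

    return velocity
```

Because the velocity is constant along a straight path, `euler_sample` lands on the target after a single step in this toy case, and using more steps gives the same endpoint. Curved trajectories, by contrast, need many small steps to follow accurately.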

Development of CPS Platform for Autonomous Construction

Nov 29, 2024

Yuichiro Kasahara, Kota Akinari, Tomoya Kouno, Noriko Sano, Taro Abe, Genki Yamauchi, Daisuke Endo, Takeshi Hashimoto, Keiji Nagatani, Ryo Kurazume

Abstract:In recent years, labor shortages due to the declining birthrate and aging population have become significant challenges at construction sites in developed countries, including Japan. To address these challenges, we are developing an open platform called ROS2-TMS for Construction, a Cyber-Physical System (CPS) for construction sites, to achieve both efficiency and safety in earthwork operations. In ROS2-TMS for Construction, the system comprehensively collects and stores environmental information from sensors placed throughout the construction site. Based on these data, a real-time virtual construction site is created in cyberspace. Then, based on the state of construction machinery and environmental conditions in cyberspace, the optimal next actions for actual construction machinery are determined, and the construction machinery is operated accordingly. In this project, we decided to use the Open Platform for Earthwork with Robotics and Autonomy (OPERA), developed by the Public Works Research Institute (PWRI) in Japan, to control construction machinery from ROS2-TMS for Construction with an originally extended behavior tree. In this study, we present an overview of OPERA, focusing on the newly developed navigation package for operating the crawler dump, as well as the overall structure of ROS2-TMS for Construction as a CPS. Additionally, we conducted experiments using a crawler dump and a backhoe to verify the aforementioned functionalities.

Gait Sequence Upsampling using Diffusion Models for single LiDAR sensors

Oct 11, 2024

Jeongho Ahn, Kazuto Nakashima, Koki Yoshino, Yumi Iwashita, Ryo Kurazume

Abstract:Recently, 3D LiDAR has emerged as a promising technique in the field of gait-based person identification, serving as an alternative to traditional RGB cameras, due to its robustness under varying lighting conditions and its ability to capture 3D geometric information. However, long capture distances or the use of low-cost LiDAR sensors often result in sparse human point clouds, leading to a decline in identification performance. To address these challenges, we propose a sparse-to-dense upsampling model for pedestrian point clouds in LiDAR-based gait recognition, named LidarGSU, which is designed to improve the generalization capability of existing identification models. Our method utilizes diffusion probabilistic models (DPMs), which have shown high fidelity in generative tasks such as image completion. In this work, we leverage DPMs on sparse sequential pedestrian point clouds as conditional masks in a video-to-video translation approach, applied in an inpainting manner. We conducted extensive experiments on the SUSTech1K dataset to evaluate the generative quality and recognition performance of the proposed method. Furthermore, we demonstrate the applicability of our upsampling model using a real-world dataset, captured with a low-resolution sensor across varying measurement distances.
