Shiwu Zhang

Traversing the Narrow Path: A Two-Stage Reinforcement Learning Framework for Humanoid Beam Walking

Aug 29, 2025

TianChen Huang, Wei Gao, Runchen Xu, Shiwu Zhang

Abstract: Traversing narrow beams is challenging for humanoids due to sparse, safety-critical contacts and the fragility of purely learned policies. We propose a physically grounded, two-stage framework that couples an XCoM/LIPM footstep template with a lightweight residual planner and a simple low-level tracker. Stage-1 is trained on flat ground: the tracker learns to robustly follow footstep targets by adding small random perturbations to heuristic footsteps, without any hand-crafted centerline locking, so it acquires stable contact scheduling and strong target-tracking robustness. Stage-2 is trained in simulation on a beam: a high-level planner predicts a body-frame residual $(\Delta x, \Delta y, \Delta \psi)$ for the swing foot only, refining the template step to prioritize safe, precise placement under narrow support while preserving interpretability. To ease deployment, sensing is kept minimal and consistent between simulation and hardware: the planner consumes compact, forward-facing elevation cues together with onboard IMU and joint signals. On a Unitree G1, our system reliably traverses a 0.2 m-wide, 3 m-long beam. Across simulation and real-world studies, residual refinement consistently outperforms template-only and monolithic baselines in success rate, centerline adherence, and safety margins, while the structured footstep interface enables transparent analysis and low-friction sim-to-real transfer.

* Project website: https://huangtc233.github.io/Traversing-the-Narrow-Path/
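
The footstep interface described in the abstract above can be illustrated with a minimal sketch. It assumes the standard extrapolated centre of mass definition for a linear inverted pendulum, $\xi = x + \dot{x}/\omega_0$ with $\omega_0 = \sqrt{g/z_0}$, and a residual that is rotated from the body frame into the world frame before being added to the template step. The function names `xcom` and `refine_step`, their arguments, and the rotation convention are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def xcom(com_pos, com_vel, com_height):
    """Extrapolated centre of mass: xi = x + v / omega0, omega0 = sqrt(g / z0)."""
    omega0 = math.sqrt(G / com_height)
    return tuple(p + v / omega0 for p, v in zip(com_pos, com_vel))


def refine_step(template_xy, template_yaw, residual, body_yaw):
    """Add a body-frame residual (dx, dy, dpsi) to a world-frame template step.

    Assumption: the residual translation is rotated by the body yaw and the
    residual rotation is added directly to the template yaw.
    """
    dx, dy, dpsi = residual
    c, s = math.cos(body_yaw), math.sin(body_yaw)
    x = template_xy[0] + c * dx - s * dy
    y = template_xy[1] + s * dx + c * dy
    return (x, y), template_yaw + dpsi


# Usage: a template step placed at the XCoM, nudged by a small residual.
target = xcom(com_pos=(0.0, 0.0), com_vel=(0.4, 0.0), com_height=0.7)
step_xy, step_yaw = refine_step(target, 0.0, (0.02, -0.01, 0.0), body_yaw=0.0)
```

Keeping the learned part as a small correction to a physically derived step is what makes the planner output easy to inspect: the template and the residual can be logged and compared separately.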

High-Precision Transformer-Based Visual Servoing for Humanoid Robots in Aligning Tiny Objects

Mar 06, 2025

Jialong Xue, Wei Gao, Yu Wang, Chao Ji, Dongdong Zhao, Shi Yan, Shiwu Zhang

Abstract: High-precision alignment of tiny objects remains a common and critical challenge for humanoid robots in real-world settings. To address this problem, this paper proposes a vision-based framework for precisely estimating and controlling the relative position between a handheld tool and a target object, e.g., a screwdriver tip and a screw head slot. By fusing images from the robot's head and torso cameras with its head joint angles, the proposed Transformer-based visual servoing method effectively corrects the handheld tool's positional errors, especially at close distance. Experiments on M4-M8 screws demonstrate an average convergence error of 0.8-1.3 mm and a success rate of 93%-100%. Comparative analysis indicates that this high-precision alignment capability is enabled by the Distance Estimation Transformer architecture and the Multi-Perception-Head mechanism proposed in this paper.

* for associated video, see https://b23.tv/cklF7aK

A Real-time Non-contact Localization Method for Faulty Electric Energy Storage Components using Highly Sensitive Magnetometers

Aug 15, 2023

Tonghui Peng, Wei Gao, Ya Wu, Yulong Ma, Shiwu Zhang, Yinan Hu

Abstract: With the wide application of electric energy storage component arrays, such as battery arrays, capacitor arrays, and inductor arrays, their potential safety risks have gradually drawn public attention. However, existing technologies cannot meet the need for non-contact, real-time diagnosis of faulty components inside these massive arrays. To solve this problem, this paper proposes a new method based on the beamforming spatial filtering algorithm to precisely locate faulty components within the arrays in real time. The method uses highly sensitive magnetometers to collect the magnetic signals from energy storage component arrays, without damaging or even contacting any component. The experimental results demonstrate the potential of the proposed method in securing energy storage component arrays. Within an imaging area of 80 mm $\times$ 80 mm, a single faulty component among nine components can be localized with an accuracy of 0.72 mm for capacitor arrays and 1.60 mm for battery arrays.

Active View Planning for Visual SLAM in Outdoor Environments Based on Continuous Information Modeling

Nov 12, 2022

Zhihao Wang, Haoyao Chen, Shiwu Zhang, Yunjiang Lou

Abstract: Visual simultaneous localization and mapping (vSLAM) is widely used in GPS-denied and open-field environments for ground and surface robots. However, because of frequent perception failures caused by a lack of visual texture or by the swing of the robot's view direction on rough terrain, the accuracy and robustness of vSLAM still need to be improved. This study develops a novel view planning approach that actively perceives the areas with maximal information to address this problem; a gimbal camera is used as the main sensor. First, a map representation based on feature distribution-weighted Fisher information is proposed to completely and effectively represent the richness of environmental information. With this map representation, a continuous environmental information model is established to convert the discrete information space into a continuous one for numerical optimization in real time. Subsequently, receding horizon optimization is used to obtain the optimal informative viewpoints while simultaneously considering robotic perception, exploration, and motion cost based on the continuous environmental model. Finally, several simulations and outdoor experiments are performed to verify the improvement in localization robustness and accuracy achieved by the proposed approach.

* 11 pages, 14 figures

Edge-based Monocular Thermal-Inertial Odometry in Visually Degraded Environments

Oct 18, 2022

Yu Wang, Haoyao Chen, Yufeng Liu, Shiwu Zhang

Abstract: State estimation in complex illumination environments based on conventional visual-inertial odometry is challenging because of the severe degradation of images from the visual camera. A thermal infrared camera can operate around the clock and is less affected by illumination variation. However, most existing visual data association algorithms are incompatible with it because thermal infrared data are noisy and low in contrast. Motivated by the phenomenon that thermal radiation varies most significantly at the edges of objects, this study proposes ETIO, the first edge-based monocular thermal-inertial odometry for robust localization in visually degraded environments. Instead of the raw image, we use the binarized image from edge extraction for pose estimation to overcome the poor quality of thermal infrared images. Then, an adaptive feature tracking strategy, ADT-KLT, is developed for robust data association based on limited edge information and its distance distribution. Finally, a pose graph optimization performs real-time estimation over a sliding window of recent states by combining IMU pre-integration with the reprojection error of all edge feature observations. We evaluated the proposed system on public datasets and in real-world experiments and compared it against state-of-the-art methods. The proposed ETIO was shown to enable accurate and robust localization around the clock.

* 8 pages, 10 figures
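
The idea of replacing the raw thermal image with a binarized edge image can be sketched as follows. The abstract does not say which edge extractor ETIO uses, so this example substitutes a plain Sobel gradient magnitude with a fixed threshold as an assumption; the function name `binarize_edges` and the `threshold` parameter are hypothetical.

```python
def binarize_edges(image, threshold):
    """Return a 0/1 edge map from a 2-D grayscale image given as a list of rows.

    Assumption: edges are pixels whose Sobel gradient magnitude reaches
    `threshold`. Border pixels are left at 0 because the 3x3 kernel does not fit.
    """
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            # Horizontal Sobel response (right column minus left column).
            gx = (image[r - 1][c + 1] + 2 * image[r][c + 1] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r][c - 1] - image[r + 1][c - 1])
            # Vertical Sobel response (bottom row minus top row).
            gy = (image[r + 1][c - 1] + 2 * image[r + 1][c] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r - 1][c] - image[r - 1][c + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[r][c] = 1
    return edges


# Usage: a synthetic frame with a cold left half and a warm right half.
frame = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edge_map = binarize_edges(frame, threshold=200)
```

A binary edge map discards absolute intensity, which is one plausible reason it is less sensitive to the noise and low contrast the abstract attributes to thermal infrared data.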
