Youmin Gong

ISS Policy: Scalable Diffusion Policy with Implicit Scene Supervision

Dec 17, 2025

Wenlong Xia, Jinhao Zhang, Ce Zhang, Yaojia Wang, Youmin Gong, Jie Mei

Abstract: Vision-based imitation learning has enabled impressive robotic manipulation skills, but its reliance on object appearance while ignoring the underlying 3D scene structure leads to low training efficiency and poor generalization. To address these challenges, we introduce Implicit Scene Supervision (ISS) Policy, a 3D visuomotor DiT-based diffusion policy that predicts sequences of continuous actions from point cloud observations. We extend DiT with a novel implicit scene supervision module that encourages the model to produce outputs consistent with the scene's geometric evolution, thereby improving the performance and robustness of the policy. Notably, ISS Policy achieves state-of-the-art performance on both single-arm manipulation tasks (MetaWorld) and dexterous hand manipulation (Adroit). In real-world experiments, it also demonstrates strong generalization and robustness. Additional ablation studies show that our method scales effectively with both data and parameters. Code and videos will be released.

Agile in the Face of Delay: Asynchronous End-to-End Learning for Real-World Aerial Navigation

Sep 17, 2025

Yude Li, Zhexuan Zhou, Huizhe Li, Youmin Gong, Jie Mei

Abstract: Robust autonomous navigation for Autonomous Aerial Vehicles (AAVs) in complex environments is a critical capability. However, modern end-to-end navigation faces a key challenge: the high-frequency control loop needed for agile flight conflicts with low-frequency perception streams, which are limited by sensor update rates and significant computational cost. This mismatch forces conventional synchronous models into undesirably low control rates. To resolve this, we propose an asynchronous reinforcement learning framework that decouples perception and control, enabling a high-frequency policy to act on the latest IMU state for immediate reactivity, while incorporating perception features asynchronously. To manage the resulting data staleness, we introduce a theoretically grounded Temporal Encoding Module (TEM) that explicitly conditions the policy on perception delays, a strategy complemented by a two-stage curriculum to ensure stable and efficient training. Validated in extensive simulations, our method was successfully deployed in zero-shot sim-to-real transfer on an onboard NUC, where it sustains a 100 Hz control rate and demonstrates robust, agile navigation in cluttered real-world environments. Our source code will be released for community reference.
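
The staleness problem described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the 10 Hz perception rate, the 30 ms latency, and the sinusoidal `delay_encoding` function are all assumptions chosen for illustration, and the actual TEM design is not specified in the abstract. The sketch only shows how a 100 Hz control tick can compute the age of the newest available perception frame and turn that age into a feature vector a policy could be conditioned on.

```python
import math


def delay_encoding(delay_s, dim=8, max_period_s=1.0):
    """Sinusoidal encoding of a perception delay in seconds.

    Hypothetical stand-in for a temporal encoding module; the paper's
    actual encoding is not given in the abstract.
    """
    feats = []
    for i in range(dim // 2):
        # Periods halve at each step: max_period_s, max_period_s / 2, ...
        period = max_period_s / (2 ** i)
        angle = 2.0 * math.pi * delay_s / period
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


def perception_ages(control_hz=100, perception_hz=10, latency_s=0.03,
                    duration_s=0.3):
    """Age of the newest usable perception frame at every control tick.

    Frame j is captured at j / perception_hz and becomes usable
    latency_s later (assumed values). Returns None while no frame
    has arrived yet.
    """
    ages = []
    for k in range(int(round(duration_s * control_hz))):
        t = k / control_hz
        # Index of the latest frame whose processing has finished by time t.
        j = math.floor((t - latency_s) * perception_hz + 1e-9)
        ages.append(None if j < 0 else t - j / perception_hz)
    return ages


if __name__ == "__main__":
    for k, age in enumerate(perception_ages()):
        if age is not None:
            # A policy input would combine the latest IMU state, the
            # stale perception feature, and this delay encoding.
            print(k, round(age, 3), [round(f, 2) for f in delay_encoding(age)][:2])
```

Under these assumed rates, the age of the perception feature sweeps between the latency (30 ms) and the latency plus one perception period minus one control period (120 ms), which is the kind of variable delay an explicit encoding lets the policy account for.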

STORM: Spatial-Temporal Iterative Optimization for Reliable Multicopter Trajectory Generation

Mar 05, 2025

Jinhao Zhang, Zhexuan Zhou, Wenlong Xia, Youmin Gong, Jie Mei

Abstract: Efficient and safe trajectory planning plays a critical role in the application of quadrotor unmanned aerial vehicles (UAVs). However, the inherent trade-off between constraint compliance and computational efficiency in UAV trajectory optimization has not been sufficiently addressed. To improve UAV trajectory optimization, we propose a spatial-temporal iterative optimization framework. First, B-splines are used to represent UAV trajectories, with safety rigorously assured by strictly enforcing constraints on the control points. Next, a set of QP-LP subproblems is derived via spatial-temporal decoupling and constraint linearization. Finally, an iterative optimization strategy incorporating guidance gradients is employed to obtain high-performance UAV trajectories in different scenarios. Both simulation and real-world experimental results validate the efficiency and high performance of the proposed framework in generating safe and fast trajectories. Our source code will be released for community reference at https://hitsz-mas.github.io/STORM
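
Enforcing constraints on control points gives guarantees for the whole curve because of the convex hull property of B-splines: every point on a segment is a convex combination of that segment's control points. The sketch below demonstrates this for a uniform cubic B-spline and a single linear constraint. The spline degree, the control points, and the half-plane `x + y <= 4` are assumptions for illustration; the abstract does not state the degree or constraint set that STORM uses.

```python
def cubic_bspline_weights(u):
    """Uniform cubic B-spline basis weights for local parameter u in [0, 1]."""
    return [
        (1 - u) ** 3 / 6.0,
        (3 * u**3 - 6 * u**2 + 4) / 6.0,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6.0,
        u**3 / 6.0,
    ]


def eval_segment(ctrl, u):
    """Evaluate one spline segment defined by four control points."""
    w = cubic_bspline_weights(u)
    dim = len(ctrl[0])
    return [sum(w[i] * ctrl[i][d] for i in range(4)) for d in range(dim)]


def satisfies(point, a, b, tol=1e-12):
    """Check the linear constraint a . point <= b."""
    return sum(ai * pi for ai, pi in zip(a, point)) <= b + tol


if __name__ == "__main__":
    # Hypothetical 2D control points and safe half-plane x + y <= 4.
    ctrl = [[0.0, 0.0], [1.0, 2.0], [2.0, 2.0], [3.0, 1.0]]
    a, b = [1.0, 1.0], 4.0

    # Constraint holds at every control point ...
    assert all(satisfies(p, a, b) for p in ctrl)
    # ... so it holds along the whole segment, since the weights are
    # nonnegative and sum to one.
    samples = [eval_segment(ctrl, k / 20.0) for k in range(21)]
    print(all(satisfies(p, a, b) for p in samples))
```

Because the check involves only a finite set of control points, a constraint over the continuous trajectory becomes a finite set of linear inequalities, which is what makes QP or LP formulations possible. The trade-off is conservatism: the control polygon can violate a constraint even when the curve itself does not.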
