Kairui Ding

DISCOVERSE: Efficient Robot Simulation in Complex High-Fidelity Environments

Jul 29, 2025

Yufei Jia, Guangyu Wang, Yuhang Dong, Junzhe Wu, Yupei Zeng, Haonan Lin, Zifan Wang, Haizhou Ge, Weibin Gu, Kairui Ding(+10 more)

Abstract: We present the first unified, modular, open-source 3DGS-based simulation framework for Real2Sim2Real robot learning. It features a holistic Real2Sim pipeline that synthesizes hyper-realistic geometry and appearance of complex real-world scenarios, paving the way for analyzing and bridging the Sim2Real gap. Powered by Gaussian Splatting and MuJoCo, Discoverse enables massively parallel simulation of multiple sensor modalities and accurate physics, with inclusive support for existing 3D assets, robot models, and ROS plugins, empowering large-scale robot learning and complex robotic benchmarks. Through extensive experiments on imitation learning, Discoverse demonstrates state-of-the-art zero-shot Sim2Real transfer performance compared to existing simulators. For code and demos: https://air-discoverse.github.io/.

* 8 pages, IROS 2025 (Camera Ready)

Hint-AD: Holistically Aligned Interpretability in End-to-End Autonomous Driving

Sep 10, 2024

Kairui Ding, Boyuan Chen, Yuchen Su, Huan-ang Gao, Bu Jin, Chonghao Sima, Wuqiang Zhang, Xiaohui Li, Paul Barsch, Hongyang Li(+1 more)

Abstract: End-to-end architectures in autonomous driving (AD) face a significant challenge in interpretability, impeding human-AI trust. Human-friendly natural language has been explored for tasks such as driving explanation and 3D captioning. However, previous works primarily focused on the paradigm of declarative interpretability, where the natural language interpretations are not grounded in the intermediate outputs of AD systems, making the interpretations only declarative. In contrast, aligned interpretability establishes a connection between language and the intermediate outputs of AD systems. Here we introduce Hint-AD, an integrated AD-language system that generates language aligned with the holistic perception-prediction-planning outputs of the AD model. By incorporating the intermediate outputs and a holistic token mixer sub-network for effective feature adaptation, Hint-AD achieves desirable accuracy and state-of-the-art results in driving language tasks including driving explanation, 3D dense captioning, and command prediction. To facilitate further study of the driving explanation task on nuScenes, we also introduce a human-labeled dataset, Nu-X. Code, dataset, and models will be publicly available.

* CoRL 2024, Project Page: https://air-discover.github.io/Hint-AD/

PreAfford: Universal Affordance-Based Pre-Grasping for Diverse Objects and Environments

Apr 04, 2024

Kairui Ding, Boyuan Chen, Ruihai Wu, Yuyang Li, Zongzheng Zhang, Huan-ang Gao, Siqi Li, Yixin Zhu, Guyue Zhou, Hao Dong(+1 more)

Abstract: Robotic manipulation of ungraspable objects with two-finger grippers presents significant challenges due to the paucity of graspable features, while traditional pre-grasping techniques, which rely on repositioning objects and leveraging external aids like table edges, lack adaptability across object categories and scenes. To address this, we introduce PreAfford, a novel pre-grasping planning framework that utilizes a point-level affordance representation and a relay training approach to enhance adaptability across a broad range of environments and object types, including those previously unseen. Demonstrated on the ShapeNet-v2 dataset, PreAfford significantly improves grasping success rates by 69% and validates its practicality through real-world experiments. This work offers a robust and adaptable solution for manipulating ungraspable objects.

* Project Page: https://air-discover.github.io/PreAfford/
