He-Yang Xu

AnchorDP3: 3D Affordance Guided Sparse Diffusion Policy for Robotic Manipulation

Jun 24, 2025

Ziyan Zhao, Ke Fan, He-Yang Xu, Ning Qiao, Bo Peng, Wenlong Gao, Dongjiang Li, Hui Shen

Abstract: We present AnchorDP3, a diffusion policy framework for dual-arm robotic manipulation that achieves state-of-the-art performance in highly randomized environments. AnchorDP3 integrates three key innovations: (1) Simulator-Supervised Semantic Segmentation, using rendered ground truth to explicitly segment task-critical objects within the point cloud, which provides strong affordance priors; (2) Task-Conditioned Feature Encoders, lightweight modules processing augmented point clouds per task, enabling efficient multi-task learning through a shared diffusion-based action expert; (3) Affordance-Anchored Keypose Diffusion with Full State Supervision, replacing dense trajectory prediction with sparse, geometrically meaningful action anchors, i.e., keyposes such as the pre-grasp pose and grasp pose, anchored directly to affordances, drastically simplifying the prediction space; the action expert is forced to predict both robot joint angles and end-effector poses simultaneously, which exploits geometric consistency to accelerate convergence and boost accuracy. Trained on large-scale, procedurally generated simulation data, AnchorDP3 achieves a 98.7% average success rate on the RoboTwin benchmark across diverse tasks under extreme randomization of objects, clutter, table height, lighting, and backgrounds. This framework, when integrated with the RoboTwin real-to-sim pipeline, has the potential to enable fully autonomous generation of deployable visuomotor policies from only a scene and an instruction, eliminating human demonstrations from learning manipulation skills.

An Embarrassingly Simple Approach to Semi-Supervised Few-Shot Learning

Sep 28, 2022

Xiu-Shen Wei, He-Yang Xu, Faen Zhang, Yuxin Peng, Wei Zhou

Abstract: Semi-supervised few-shot learning consists in training a classifier to adapt to new tasks with limited labeled data and a fixed quantity of unlabeled data. Many sophisticated methods have been developed to address the challenges this problem comprises. In this paper, we propose a simple but quite effective approach to predict accurate negative pseudo-labels of unlabeled data from an indirect learning perspective, and then augment the extremely label-constrained support set in few-shot classification tasks. Our approach can be implemented in just a few lines of code using only off-the-shelf operations, yet it is able to outperform state-of-the-art methods on four benchmark datasets.

* Accepted by NeurIPS 2022
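
The abstract above does not spell out the procedure, so the following is only a generic illustration of what a negative pseudo-label is (a class that an unlabeled sample most likely does not belong to), not the authors' method. The function names, the example logits, and the `max_prob` threshold are all assumptions made for this sketch.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def negative_pseudo_labels(logits_per_sample, max_prob=0.1):
    """Pick, for each unlabeled sample, the class it most likely does NOT belong to.

    Returns (sample_index, excluded_class) pairs. A sample is skipped when even
    its least likely class has probability above `max_prob`, a hypothetical
    confidence threshold chosen only for this illustration.
    """
    pairs = []
    for i, logits in enumerate(logits_per_sample):
        probs = softmax(logits)
        # Index of the least probable class for this sample.
        excluded = min(range(len(probs)), key=probs.__getitem__)
        if probs[excluded] <= max_prob:
            pairs.append((i, excluded))
    return pairs

# Made-up classifier scores for three unlabeled samples over three classes.
unlabeled_logits = [
    [2.0, 0.5, -3.0],   # class 2 is very unlikely
    [0.1, 0.0, 0.2],    # near-uniform: no confident exclusion
    [-4.0, 1.0, 1.5],   # class 0 is very unlikely
]
print(negative_pseudo_labels(unlabeled_logits))  # → [(0, 2), (2, 0)]
```

In a few-shot setting, pairs like these could in principle supplement a small labeled support set with "not this class" information; how the paper actually uses them is not described in the abstract.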
