David Held

Mesh-based Dynamics with Occlusion Reasoning for Cloth Manipulation

Jun 06, 2022

Zixuan Huang, Xingyu Lin, David Held

Abstract: Self-occlusion is challenging for cloth manipulation, as it makes it difficult to estimate the full state of the cloth. Ideally, a robot trying to unfold a crumpled or folded cloth should be able to reason about the cloth's occluded regions. We leverage recent advances in pose estimation for cloth to build a system that uses explicit occlusion reasoning to unfold a crumpled cloth. Specifically, we first learn a model to reconstruct the mesh of the cloth. However, the model will likely have errors due to the complexities of the cloth configurations and due to ambiguities from occlusions. Our main insight is that we can further refine the predicted reconstruction by performing test-time finetuning with self-supervised losses. The obtained reconstructed mesh allows us to use a mesh-based dynamics model for planning while reasoning about occlusions. We evaluate our system on both cloth flattening and cloth canonicalization, in which the objective is to manipulate the cloth into a canonical pose. Our experiments show that our method significantly outperforms prior methods that do not explicitly account for occlusions or perform test-time optimization.

* RSS 2022

FlowBot3D: Learning 3D Articulation Flow to Manipulate Articulated Objects

May 09, 2022

Ben Eisner, Harry Zhang, David Held

Abstract: We explore a novel method to perceive and manipulate 3D articulated objects that generalizes to enable a robot to articulate unseen classes of objects. We propose a vision-based system that learns to predict the potential motions of the parts of a variety of articulated objects to guide downstream motion planning of the system to articulate the objects. To predict the object motions, we train a neural network to output a dense vector field representing the point-wise motion direction of the points in the point cloud under articulation. We then deploy an analytical motion planner based on this vector field to achieve a policy that yields maximum articulation. We train the vision system entirely in simulation, and we demonstrate the capability of our system to generalize to unseen object instances and novel categories in both simulation and the real world, deploying our policy on a Sawyer robot with no finetuning. Results show that our system achieves state-of-the-art performance in both simulated and real-world experiments.

* Accepted to Robotics: Science and Systems (RSS) 2022
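
The FlowBot3D abstract describes a dense per-point flow field feeding an analytical planner but does not spell out the planner itself. As a rough illustration only, the sketch below assumes one simple heuristic: act at the point whose predicted flow vector is largest, and move along that vector. The function name `select_articulation_action` and the plain-tuple data layout are hypothetical and are not taken from the paper or its code.

```python
import math


def select_articulation_action(points, flows):
    """Return (contact_point, unit_direction) for the largest predicted flow.

    points: list of (x, y, z) tuples from the observed point cloud.
    flows:  list of (dx, dy, dz) tuples, one predicted motion vector per point.
    Assumption: the point with the largest flow magnitude is a good place to
    act, and the flow direction is a good direction to move it.
    """
    if not points or len(points) != len(flows):
        raise ValueError("points and flows must be non-empty and equal length")

    def magnitude(vec):
        return math.sqrt(sum(c * c for c in vec))

    # Index of the point predicted to move the most under articulation.
    best = max(range(len(flows)), key=lambda i: magnitude(flows[i]))
    norm = magnitude(flows[best])
    if norm == 0.0:
        raise ValueError("all predicted flow vectors are zero")

    # Normalize so the caller can choose its own step size.
    direction = tuple(c / norm for c in flows[best])
    return points[best], direction
```

In a closed-loop setting one would presumably re-predict the flow after each small motion and call the selection step again; that loop is omitted here.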

DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools

Mar 31, 2022

Xingyu Lin, Zhiao Huang, Yunzhu Li, Joshua B. Tenenbaum, David Held, Chuang Gan

Abstract: We consider the problem of sequential robotic manipulation of deformable objects using tools. Previous works have shown that differentiable physics simulators provide gradients to the environment state and help trajectory optimization to converge orders of magnitude faster than model-free reinforcement learning algorithms for deformable object manipulation. However, such gradient-based trajectory optimization typically requires access to the full simulator states and can only solve short-horizon, single-skill tasks due to local optima. In this work, we propose a novel framework, named DiffSkill, that uses a differentiable physics simulator for skill abstraction to solve long-horizon deformable object manipulation tasks from sensory observations. In particular, we first obtain short-horizon skills using individual tools from a gradient-based optimizer, using the full state information in a differentiable simulator; we then learn a neural skill abstractor from the demonstration trajectories which takes RGBD images as input. Finally, we plan over the skills by finding the intermediate goals and then solve long-horizon tasks. We show the advantages of our method in a new set of sequential deformable object manipulation tasks compared to previous reinforcement learning algorithms and compared to the trajectory optimizer.

* ICLR 2022. Project page: https://xingyu-lin.github.io/diffskill/

RB2: Robotic Manipulation Benchmarking with a Twist

Mar 15, 2022

Sudeep Dasari, Jianren Wang, Joyce Hong, Shikhar Bahl, Yixin Lin, Austin Wang, Abitha Thankaraj, Karanbir Chahal, Berk Calli, Saurabh Gupta(+5 more)

Abstract: Benchmarks offer a scientific way to compare algorithms using objective performance metrics. Good benchmarks have two features: (a) they should be widely useful for many research groups; and (b) they should produce reproducible findings. In robotic manipulation research, there is a trade-off between reproducibility and broad accessibility. If the benchmark is kept restrictive (fixed hardware, objects), the numbers are reproducible but the setup becomes less general. On the other hand, a benchmark could be a loose set of protocols (e.g. object sets) but the underlying variation in setups makes the results non-reproducible. In this paper, we re-imagine benchmarking for robotic manipulation as state-of-the-art algorithmic implementations, alongside the usual set of tasks and experimental protocols. The added baseline implementations will provide a way to easily recreate SOTA numbers in a new local robotic setup, thus providing credible relative rankings between existing approaches and new work. However, these local rankings could vary between different setups. To resolve this issue, we build a mechanism for pooling experimental data between labs, and thus we establish a single global ranking for existing (and proposed) SOTA algorithms. Our benchmark, called Ranking-Based Robotics Benchmark (RB2), is evaluated on tasks that are inspired by the clinically validated Southampton Hand Assessment Procedures. Our benchmark was run across two different labs and reveals several surprising findings. For example, extremely simple baselines like open-loop behavior cloning outperform more complicated models (e.g. closed loop, RNN, Offline-RL, etc.) that are preferred by the field. We hope our fellow researchers will use RB2 to improve their research's quality and rigor.

* Accepted at the NeurIPS 2021 Datasets and Benchmarks Track

Self-supervised Transparent Liquid Segmentation for Robotic Pouring

Mar 03, 2022

Gautham Narayan Narasimhan, Kai Zhang, Ben Eisner, Xingyu Lin, David Held

Abstract: Liquid state estimation is important for robotics tasks such as pouring; however, estimating the state of transparent liquids is a challenging problem. We propose a novel segmentation pipeline that can segment transparent liquids such as water from a static RGB image without requiring any manual annotations or heating of the liquid for training. Instead, we use a generative model that is capable of translating images of colored liquids into synthetically generated transparent liquid images, trained only on an unpaired dataset of colored and transparent liquid images. Segmentation labels of colored liquids are obtained automatically using background subtraction. Our experiments show that we are able to accurately predict a segmentation mask for transparent liquids without requiring any manual annotations. We demonstrate the utility of transparent liquid segmentation in a robotic pouring task that controls pouring by perceiving the liquid height in a transparent cup. Accompanying video and supplementary materials can be found

* 2022 IEEE International Conference on Robotics and Automation (ICRA)
* Accepted at ICRA 2022
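
The abstract above says segmentation labels for colored liquids come from background subtraction. A minimal sketch of that idea follows, assuming grayscale images stored as nested lists and a fixed intensity threshold; the function name, the threshold value of 30, and the data layout are illustrative assumptions rather than details from the paper.

```python
def background_subtraction_mask(frame, background, threshold=30):
    """Label a pixel 1 where it differs from the empty-scene background.

    frame, background: 2D lists of grayscale intensities with equal shape.
    threshold: hypothetical intensity difference above which a pixel is
    treated as liquid; a real pipeline would tune or replace this.
    """
    if len(frame) != len(background):
        raise ValueError("frame and background must have the same height")
    mask = []
    for frame_row, background_row in zip(frame, background):
        if len(frame_row) != len(background_row):
            raise ValueError("frame and background must have the same width")
        mask.append([
            1 if abs(pixel - reference) > threshold else 0
            for pixel, reference in zip(frame_row, background_row)
        ])
    return mask
```

Masks produced this way for colored-liquid images could then be paired with the translated transparent-liquid images as training labels, which is the role the abstract assigns to them.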

Semi-supervised 3D Object Detection via Temporal Graph Neural Networks

Feb 01, 2022

Jianren Wang, Haiming Gang, Siddharth Ancha, Yi-Ting Chen, David Held

Abstract: 3D object detection plays an important role in autonomous driving and other robotics applications. However, these detectors usually require training on large amounts of annotated data that is expensive and time-consuming to collect. Instead, we propose leveraging large amounts of unlabeled point cloud videos by semi-supervised learning of 3D object detectors via temporal graph neural networks. Our insight is that temporal smoothing can create more accurate detection results on unlabeled data, and these smoothed detections can then be used to retrain the detector. We learn to perform this temporal reasoning with a graph neural network, where edges represent the relationship between candidate detections in different time frames. After semi-supervised learning, our method achieves state-of-the-art detection performance on the challenging nuScenes and H3D benchmarks, compared to baselines trained on the same amount of labeled data. Project and code are released at https://www.jianrenw.com/SOD-TGNN/.

* 3DV 2021
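
The paper performs temporal smoothing with a learned graph neural network. To make the underlying intuition concrete, the sketch below swaps in a much simpler stand-in, a centered moving average over one tracked object's per-frame confidence scores, and keeps frames whose smoothed score clears a threshold as pseudo-labels. Both function names, the window size, and the threshold are hypothetical choices for illustration, not the authors' method.

```python
def smooth_scores(scores, window=3):
    """Centered moving average of per-frame detection confidences."""
    half = window // 2
    smoothed = []
    for t in range(len(scores)):
        # Clip the window at the sequence boundaries.
        start = max(0, t - half)
        stop = min(len(scores), t + half + 1)
        neighborhood = scores[start:stop]
        smoothed.append(sum(neighborhood) / len(neighborhood))
    return smoothed


def select_pseudo_labels(scores, threshold=0.5, window=3):
    """Return frame indices whose smoothed confidence reaches the threshold."""
    smoothed = smooth_scores(scores, window)
    return [t for t, score in enumerate(smoothed) if score >= threshold]
```

With scores such as `[0.9, 0.3, 0.8, 0.9, 0.0]`, the low-confidence second frame is retained because its neighbors are confident, which is the kind of correction temporal context is meant to provide before retraining.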

OSSID: Online Self-Supervised Instance Detection by (and for) Pose Estimation

Jan 18, 2022

Qiao Gu, Brian Okorn, David Held

Abstract: Real-time object pose estimation is necessary for many robot manipulation algorithms. However, state-of-the-art methods for object pose estimation are trained for a specific set of objects; these methods thus need to be retrained to estimate the pose of each new object, often requiring tens of GPU-days of training for optimal performance. In this paper, we propose the OSSID framework, leveraging a slow zero-shot pose estimator to self-supervise the training of a fast detection algorithm. This fast detector can then be used to filter the input to the pose estimator, drastically improving its inference speed. We show that this self-supervised training exceeds the performance of existing zero-shot detection methods on two widely used object pose estimation and detection datasets, without requiring any human annotations. Further, we show that the resulting method for pose estimation has a significantly faster inference speed, due to the ability to filter out large parts of the image. Thus, our method for self-supervised online learning of a detector (trained using pseudo-labels from a slow pose estimator) leads to accurate pose estimation at real-time speeds, without requiring human annotations. Supplementary materials and code can be found at https://georgegu1997.github.io/OSSID/

* 10 pages, 6 figures. Accepted to RA-L

Self-Supervised Point Cloud Completion via Inpainting

Nov 21, 2021

Himangi Mittal, Brian Okorn, Arpit Jangid, David Held

Abstract: When navigating in urban environments, many of the objects that need to be tracked and avoided are heavily occluded. Planning and tracking using these partial scans can be challenging. The aim of this work is to learn to complete these partial point clouds, giving us a full understanding of the object's geometry using only partial observations. Previous methods achieve this with the help of complete, ground-truth annotations of the target objects, which are available only for simulated datasets. However, such ground truth is unavailable for real-world LiDAR data. In this work, we present a self-supervised point cloud completion algorithm, PointPnCNet, which is trained only on partial scans without assuming access to complete, ground-truth annotations. Our method achieves this via inpainting. We remove a portion of the input data and train the network to complete the missing region. As it is difficult to determine which regions were occluded in the initial cloud and which were synthetically removed, our network learns to complete the full cloud, including the missing regions in the initial partial cloud. We show that our method outperforms previous unsupervised and weakly-supervised methods on both the synthetic dataset, ShapeNet, and real-world LiDAR dataset, Semantic KITTI.

* BMVC 2021 (Oral)
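
The inpainting setup described in the abstract, removing part of an already partial scan and training the network to restore it, can be sketched as a data-preparation step. The version below assumes the removed region is a sphere around a chosen center and that the training target is the original partial scan; the function name `make_inpainting_pair` and the spherical region are illustrative assumptions, since the abstract does not say how regions are chosen.

```python
def make_inpainting_pair(partial_cloud, center, radius):
    """Build one (network_input, training_target) pair for inpainting.

    partial_cloud: list of (x, y, z) points from a real, already partial scan.
    center, radius: hypothetical spherical region to remove synthetically.
    The input is the scan with the region removed; the target is the scan
    as observed, so no complete ground-truth shape is needed.
    """
    radius_squared = radius * radius

    def outside_region(point):
        distance_squared = sum((a - b) ** 2 for a, b in zip(point, center))
        return distance_squared > radius_squared

    network_input = [p for p in partial_cloud if outside_region(p)]
    training_target = list(partial_cloud)
    return network_input, training_target
```

Because the network cannot tell synthetic holes from real occlusions, training on such pairs is what the abstract credits for the model learning to fill both.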

FabricFlowNet: Bimanual Cloth Manipulation with a Flow-based Policy

Nov 10, 2021

Thomas Weng, Sujay Bajracharya, Yufei Wang, Khush Agrawal, David Held

Abstract: We address the problem of goal-directed cloth manipulation, a challenging task due to the deformability of cloth. Our insight is that optical flow, a technique normally used for motion estimation in video, can also provide an effective representation for corresponding cloth poses across observation and goal images. We introduce FabricFlowNet (FFN), a cloth manipulation policy that leverages flow as both an input and as an action representation to improve performance. FabricFlowNet also elegantly switches between bimanual and single-arm actions based on the desired goal. We show that FabricFlowNet significantly outperforms state-of-the-art model-free and model-based cloth manipulation policies that take image input. We also present real-world experiments on a bimanual system, demonstrating effective sim-to-real transfer. Finally, we show that our method generalizes when trained on a single square cloth to other cloth shapes, such as T-shirts and rectangular cloths. Video and other supplementary materials are available at: https://sites.google.com/view/fabricflownet.

* CoRL 2021

Active Safety Envelopes using Light Curtains with Probabilistic Guarantees

Jul 08, 2021

Siddharth Ancha, Gaurav Pathak, Srinivasa G. Narasimhan, David Held

Abstract: To safely navigate unknown environments, robots must accurately perceive dynamic obstacles. Instead of directly measuring the scene depth with a LiDAR sensor, we explore the use of a much cheaper and higher resolution sensor: programmable light curtains. Light curtains are controllable depth sensors that sense only along a surface that a user selects. We use light curtains to estimate the safety envelope of a scene: a hypothetical surface that separates the robot from all obstacles. We show that generating light curtains that sense random locations (from a particular distribution) can quickly discover the safety envelope for scenes with unknown objects. Importantly, we produce theoretical safety guarantees on the probability of detecting an obstacle using random curtains. We combine random curtains with a machine-learning-based model that forecasts and tracks the motion of the safety envelope efficiently. Our method accurately estimates safety envelopes while providing probabilistic safety guarantees that can be used to certify the efficacy of a robot perception system to detect and avoid dynamic obstacles. We evaluate our approach in a simulated urban driving environment and a real-world environment with moving pedestrians using a light curtain device and show that we can estimate safety envelopes efficiently and effectively. Project website: https://siddancha.github.io/projects/active-safety-envelopes-with-guarantees

* 18 pages, Published at Robotics: Science and Systems (RSS) 2021
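
The abstract mentions probabilistic guarantees for detecting obstacles with random curtains but does not give the analysis. As a simplified illustration only, the sketch below assumes each random curtain detects a given obstacle independently with the same probability `p_single`; under that assumption the chance of at least one detection in `k` curtains is 1 - (1 - p_single)^k. The independence assumption and both function names are mine and may not match how the paper derives its guarantees.

```python
import math


def detection_probability(p_single, num_curtains):
    """Probability that at least one of num_curtains curtains detects.

    Assumes independent curtains with identical per-curtain detection
    probability p_single (an illustrative simplification).
    """
    if not 0.0 <= p_single <= 1.0 or num_curtains < 0:
        raise ValueError("need 0 <= p_single <= 1 and num_curtains >= 0")
    return 1.0 - (1.0 - p_single) ** num_curtains


def curtains_needed(p_single, target):
    """Smallest number of curtains whose detection probability reaches target."""
    if not 0.0 < p_single < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("need 0 < p_single < 1 and 0 < target < 1")
    # Solve 1 - (1 - p)^k >= target for the smallest integer k.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))
```

Under these assumptions, a curtain that detects half the time would need to be placed seven times to reach a 99% detection probability, which shows why many cheap random curtains can add up to a strong guarantee.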
