The ability to estimate an object's properties through interaction will enable robots to manipulate novel objects. An object's dynamics, specifically its friction and inertial parameters, have so far only been estimated in lab environments with precise and often external sensing. Could we infer an object's dynamics in the wild with only the robot's own sensors? In this paper, we explore the estimation of the dynamics of a grasped object in motion, using tactile force sensing at multiple fingertips. Our estimation approach does not rely on torque sensing. To estimate friction, we develop a control scheme that actively interacts with the object until slip is detected. To perform the inertial estimation robustly, we set up a factor graph that fuses all of our sensor measurements on physically consistent manifolds and perform inference. We show that tactile fingertips enable in-hand dynamics estimation of low-mass objects.
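As a minimal illustration of interaction-based inference (not the paper's factor-graph formulation), the sketch below recovers an object's mass by least squares from simulated fingertip force and acceleration data; all quantities are placeholders:

```python
import numpy as np

# Placeholder data: per-timestep net contact force summed over the
# fingertips (N) and object linear acceleration (m/s^2), both expressed
# in the world frame. In the paper these come from tactile fingertips
# and kinematics; here they are simulated.
rng = np.random.default_rng(0)
g = np.array([0.0, 0.0, -9.81])
true_mass = 0.15
accels = rng.normal(0.0, 2.0, size=(200, 3))
forces = true_mass * (accels - g) + rng.normal(0.0, 0.05, size=(200, 3))

# Newton's second law for the grasped object: f_net = m * (a - g).
# Stacking all timesteps gives an overdetermined linear system in the
# mass, solved in closed form by least squares.
A = (accels - g).reshape(-1, 1)
b = forces.reshape(-1)
mass_est, *_ = np.linalg.lstsq(A, b, rcond=None)
print(f"estimated mass: {mass_est[0]:.3f} kg")
```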
Ergonomic analysis of human posture plays a vital role in understanding long-term, work-related safety and health. Such analysis is often hindered by the difficulty of estimating human posture. We introduce a new approach to human posture estimation for teleoperation tasks that relies solely on a haptic-input device for generating observations. We model the human upper body as a redundant, partially observable dynamical system, which allows us to naturally formulate the estimation problem as probabilistic inference and solve it with a standard particle filter. We show that our approach accurately estimates the posture of different human users without knowing their specific segment lengths. We evaluate our approach by comparing it with posture estimates from a commercial motion capture system. Our results show that the proposed algorithm successfully estimates human posture based only on the trajectory of the haptic-input device stylus. We additionally show that ergonomic risk estimates derived from our posture estimates are comparable to those derived from gold-standard, motion-capture-based pose estimates.
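A minimal sketch of the particle-filter idea, with a toy planar two-link arm standing in for the paper's redundant upper-body model and a made-up stylus observation:

```python
import numpy as np

rng = np.random.default_rng(1)

def fk(q, lengths=(0.3, 0.25)):
    """Planar two-link forward kinematics: joint angles -> stylus (x, y)."""
    l1, l2 = lengths
    x = l1 * np.cos(q[..., 0]) + l2 * np.cos(q[..., 0] + q[..., 1])
    y = l1 * np.sin(q[..., 0]) + l2 * np.sin(q[..., 0] + q[..., 1])
    return np.stack([x, y], axis=-1)

n = 500
particles = rng.uniform(-np.pi, np.pi, size=(n, 2))  # joint-angle hypotheses
stylus_obs = np.array([0.4, 0.2])                    # observed stylus position
obs_noise = 0.02

for _ in range(50):
    particles += rng.normal(0, 0.05, particles.shape)   # random-walk motion model
    err = np.linalg.norm(fk(particles) - stylus_obs, axis=-1)
    w = np.exp(-0.5 * (err / obs_noise) ** 2) + 1e-300  # Gaussian likelihood (guarded)
    w /= w.sum()
    particles = particles[rng.choice(n, size=n, p=w)]   # multinomial resampling

# Note: real arms have multiple inverse-kinematics modes; a mean over a
# multimodal posterior is only meaningful once the filter has collapsed.
print("posterior mean joint angles:", particles.mean(axis=0))
```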
We propose a novel approach to multi-fingered grasp planning that leverages learned deep neural network models. We train a voxel-based 3D convolutional neural network to predict grasp success probability as a function of both visual information about an object and the grasp configuration. We then formulate grasp planning as inferring the grasp configuration that maximizes the probability of grasp success. In addition, we learn a prior over grasp configurations as a mixture density network conditioned on our voxel-based object representation. We show that this object-conditional prior improves grasp inference with the learned grasp success prediction network compared to a learned object-agnostic prior or an uninformed uniform prior. Our work is the first to directly plan high-quality multi-fingered grasps in configuration space using a deep neural network without the need for an external planner. We validate our inference method by performing multi-fingered grasping on a physical robot. Our experimental results show that our planning method outperforms existing neural-network-based grasp planning methods.
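The inference step can be sketched as gradient ascent on the predicted success probability with respect to the grasp configuration; the tiny MLP below is a stand-in for the paper's voxel-based 3D CNN, and all names and shapes are illustrative:

```python
import torch

class SuccessNet(torch.nn.Module):
    """Stand-in for the trained network p(success | object, grasp)."""
    def __init__(self, obj_dim=32, grasp_dim=7):
        super().__init__()
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(obj_dim + grasp_dim, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, 1))

    def forward(self, obj, grasp):
        return torch.sigmoid(self.mlp(torch.cat([obj, grasp], dim=-1)))

net = SuccessNet()
for p in net.parameters():
    p.requires_grad_(False)                      # network weights stay fixed

obj_feat = torch.randn(1, 32)                    # placeholder object encoding
grasp = torch.randn(1, 7, requires_grad=True)    # grasp configuration variable
opt = torch.optim.Adam([grasp], lr=1e-2)

# Planning as inference: ascend log p(success) w.r.t. the grasp only.
for _ in range(100):
    opt.zero_grad()
    loss = -torch.log(net(obj_feat, grasp)).mean()
    loss.backward()
    opt.step()
```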
The purpose of this benchmark is to evaluate the planning and control aspects of robotic in-hand manipulation systems. The goal is to assess a system's ability to change the pose of a hand-held object using the fingers, the environment, or a combination of both. Given an object surface mesh from the YCB dataset, we provide examples of initial and goal states (i.e., static object poses and fingertip locations) for various in-hand manipulation tasks. We further propose metrics that measure the error in reaching the goal state from a specific initial state, which, when aggregated across all tasks, also serve as a measure of the system's in-hand manipulation capability. We provide supporting software, task examples, and evaluation results associated with the benchmark. All supporting material is available at https://robot-learning.cs.utah.edu/project/benchmarking_in_hand_manipulation
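One hedged reading of a goal-reaching error metric, computed as position and geodesic-orientation error between the goal and reached object poses (the benchmark's exact metric definitions live in the supporting material):

```python
import numpy as np

def pose_error(T_goal, T_reached):
    """Position error (m) and orientation error (rad) between two 4x4
    homogeneous object poses; an illustrative metric, not necessarily
    the benchmark's exact definition."""
    dp = np.linalg.norm(T_goal[:3, 3] - T_reached[:3, 3])
    R = T_goal[:3, :3].T @ T_reached[:3, :3]
    # Geodesic distance on SO(3); clip guards against numerical drift.
    dtheta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    return dp, dtheta
```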
Assistive robots designed for physical interaction with objects could play an important role in assisting with mobility and fall prevention in healthcare facilities. Autonomous mobile manipulation, however, remains a hurdle to using such robots safely in real-life applications. In this article, we introduce a mobile manipulation framework based on model predictive control with learned dynamics models of objects. We focus on the specific problem of manipulating legged objects, such as those commonly found in healthcare environments and personal dwellings (e.g., walkers, tables, chairs). We describe a probabilistic method for autonomously learning an approximate dynamics model of these objects, in which we learn the dynamic parameters from a small dataset of force and motion data collected from interactions between the robot and the object. Moreover, we account for multiple manipulation strategies by formulating manipulation planning as a mixed-integer convex optimization. The proposed framework considers the hybrid control problem comprising i) choosing which leg to grasp and ii) controlling the continuous forces applied for manipulation. We formalize our algorithm as model predictive control to compensate for modeling errors and find an optimal path to move the object from one configuration to another. We show results for several objects with various wheel configurations. Simulation and physical experiments show that the learned dynamics models are sufficiently accurate for safe, collision-free manipulation. Combined with the proposed manipulation planning algorithm, the robot successfully moves the objects to desired poses while avoiding collisions.
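A toy planar sketch of the hybrid formulation, assuming a mixed-integer-capable solver is installed for cvxpy: a binary variable selects which leg to grasp, and a continuous force applied there tracks a desired net wrench. Leg positions and the target wrench are made-up placeholders:

```python
import cvxpy as cp
import numpy as np

legs = np.array([[0.3, 0.2], [-0.3, 0.2], [0.3, -0.2], [-0.3, -0.2]])
desired = np.array([1.0, 0.0, 0.1])        # target (fx, fy, torque about CoM)
f_max = 5.0

z = cp.Variable(len(legs), boolean=True)   # which leg is grasped
F = cp.Variable((len(legs), 2))            # force applied at each leg

force = cp.sum(F, axis=0)
torque = cp.sum(cp.multiply(legs[:, 0], F[:, 1]) - cp.multiply(legs[:, 1], F[:, 0]))
cons = [cp.sum(z) == 1]
for i in range(len(legs)):
    cons += [cp.abs(F[i]) <= f_max * z[i]]  # force only through the grasped leg

obj = cp.Minimize(cp.norm1(cp.hstack([force, torque]) - desired)
                  + 0.01 * cp.sum(cp.abs(F)))
# Requires a mixed-integer solver (e.g. HiGHS via SCIPY, GLPK_MI, or SCIP).
cp.Problem(obj, cons).solve()
print("grasp leg:", int(np.argmax(z.value)), "force:", F.value.round(2))
```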
Using simulation to train robot manipulation policies holds the promise of an almost unlimited amount of training data, generated safely out of harm's way. One of the key challenges of using simulation, to date, has been bridging the reality gap so that policies trained in simulation can be deployed in the real world. We explore the reality gap in the context of learning a contextual policy for multi-fingered robotic grasping. We propose GOAT (Grasping Objects Approach for Tactile robotic hands), an approach that learns to overcome the reality gap. We use human hand motion demonstrations to initialize and reduce the search space for learning. We contextualize our policy with the bounding-cuboid dimensions of the object of interest, which gives the policy a more flexible representation than directly using an image or point cloud. Leveraging the hand's fingertip touch sensors allows the policy to overcome both the loss of geometric information introduced by the coarse bounding box and pose estimation uncertainty. We show that our learned policy runs successfully on a real robot without any fine-tuning, thus bridging the reality gap.
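A hypothetical sketch of how such a contextual observation might be assembled; the names, shapes, and concatenation order below are illustrative, not the paper's exact interface:

```python
import numpy as np

def build_policy_input(cuboid_dims, touch, proprio):
    """Concatenate the bounding-cuboid dimensions, binary fingertip
    touch signals, and hand proprioception into one observation
    vector for the contextual policy (illustrative layout)."""
    return np.concatenate([cuboid_dims, touch.astype(float), proprio])

obs = build_policy_input(
    cuboid_dims=np.array([0.06, 0.04, 0.12]),  # object w, d, h in meters
    touch=np.array([1, 0, 1, 0]),              # per-fingertip contact flags
    proprio=np.zeros(16),                      # joint positions (placeholder)
)
```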
Deep learning has enabled remarkable improvements in grasp synthesis for previously unseen objects viewed from partial views. However, existing approaches cannot explicitly reason about the full 3D geometry of the object when selecting a grasp, relying instead on indirect geometric reasoning learned as a byproduct of training grasp success networks. This forgoes common-sense geometric reasoning, such as avoiding undesired robot-object collisions. We propose to use a novel learned 3D reconstruction to bring geometric awareness to a grasping system. We leverage the structure of the reconstruction network to learn a grasp success classifier, which serves as the objective function for a continuous grasp optimization. We additionally constrain the optimization explicitly to avoid undesired contact, directly using the reconstruction. Because it uses the reconstruction network, our method can grasp objects from a new camera viewpoint not seen during training. Our results on real robot executions show that utilizing learned geometry outperforms alternative formulations based on partial-view information alone. Our results can be found at https://sites.google.com/view/reconstruction-grasp/.
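A minimal sketch of the constrained refinement under stand-in networks: ascend a learned success score while penalizing penetration of a hypothetical gripper keypoint into the reconstructed surface, represented here by a placeholder signed-distance network:

```python
import torch

# Placeholder networks: in the paper, the success classifier and the
# reconstruction share structure; here they are independent stand-ins.
success_net = torch.nn.Sequential(torch.nn.Linear(6, 32), torch.nn.ReLU(),
                                  torch.nn.Linear(32, 1), torch.nn.Sigmoid())
sdf_net = torch.nn.Sequential(torch.nn.Linear(3, 32), torch.nn.Tanh(),
                              torch.nn.Linear(32, 1))

grasp = torch.zeros(1, 6, requires_grad=True)   # gripper pose (xyz + rpy)
opt = torch.optim.Adam([grasp], lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    # Toy gripper keypoint rigidly offset from the gripper position.
    keypoint = grasp[:, :3] + torch.tensor([[0.0, 0.0, 0.05]])
    penetration = torch.relu(-sdf_net(keypoint))   # > 0 only inside the surface
    loss = (-torch.log(success_net(grasp) + 1e-6).mean()
            + 10.0 * penetration.sum())            # penalized collision constraint
    loss.backward()
    opt.step()
```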
We propose a method for sim-to-real robot learning that exploits simulator state information in a way that scales to many objects. First, we train a pair of encoders on raw object pose targets to learn representations that accurately capture the state of a multi-object environment. Second, we use these encoders in a reinforcement learning algorithm to train image-based policies capable of manipulating many objects. One encoder consumes RGB images and is used in our policy network; the other directly consumes a set of raw object poses and is used for reward calculation and value estimation. We evaluate our method on the task of pushing a collection of objects to desired tabletop regions. Compared to methods that rely only on images or use fixed-length state encodings, our method achieves higher success rates, performs well in the real world without fine-tuning, and generalizes to different numbers and types of objects not seen during training.
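A sketch of the two-encoder idea in PyTorch, not the paper's exact architecture: the image encoder feeds the policy, while a permutation-invariant pose-set encoder handles any number of objects for reward and value computation:

```python
import torch
import torch.nn as nn

class ImageEncoder(nn.Module):
    """Consumes RGB images; feeds the policy network."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, dim))

    def forward(self, img):
        return self.net(img)

class PoseSetEncoder(nn.Module):
    """Consumes a variable-size set of raw object poses for reward and
    value estimation. A deep-set style mean over per-object features
    keeps the encoding size fixed for any object count."""
    def __init__(self, pose_dim=7, dim=64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(pose_dim, 64), nn.ReLU(),
                                 nn.Linear(64, dim))

    def forward(self, poses):                 # poses: (batch, n_objects, pose_dim)
        return self.phi(poses).mean(dim=1)

img_z = ImageEncoder()(torch.randn(2, 3, 64, 64))
state_z = PoseSetEncoder()(torch.randn(2, 5, 7))  # 5 objects; any n works
```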
The sequence in which a complex product is assembled directly impacts the ease and efficiency of the assembly process, whether executed by a human or a robot. A sequence that gives the assembler the greatest freedom of movement is therefore desirable. Our main contribution is an expression of obstruction relationships between parts as a disassembly interference graph (DIG). We validate this heuristic by developing a disassembly sequence planner that partitions assemblies in a way that prioritizes access to parts, producing plans whose total length is comparable to those of two state-of-the-art assembly methods. Using DIG, our method generates successive subassembly decompositions, yielding a tree structure that makes parallelization opportunities apparent. Our planner generates viable disassembly plans by minimizing our part-blockage measure, demonstrating that this measure is a valuable addition to the Assembly Sequence Planning toolkit.
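A hedged sketch of the DIG idea on a made-up four-part assembly: edges record which parts block which, and a greedy planner removes the least-blocked part at each step. The hard-coded `blocks` predicate stands in for the paper's geometric obstruction test:

```python
def blocks(u, v):
    """Placeholder obstruction test: does part u block removal of v?"""
    demo_edges = {("lid", "gear"), ("housing", "gear"), ("housing", "shaft")}
    return (u, v) in demo_edges

parts = ["lid", "gear", "shaft", "housing"]
# DIG as an adjacency map: dig[v] is the set of parts blocking v.
dig = {v: {u for u in parts if u != v and blocks(u, v)} for v in parts}

# Greedy disassembly: repeatedly remove the part with the smallest
# blockage measure among the parts still in the assembly.
order, remaining = [], set(parts)
while remaining:
    nxt = min(remaining, key=lambda v: len(dig[v] & remaining))
    order.append(nxt)
    remaining.remove(nxt)
print("disassembly order:", order)  # reversed, a candidate assembly sequence
```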