Gaurav S. Sukhatme

Generating Behaviorally Diverse Policies with Latent Diffusion Models

May 30, 2023

Shashank Hegde, Sumeet Batra, K. R. Zentner, Gaurav S. Sukhatme

Abstract:Recent progress in Quality Diversity Reinforcement Learning (QD-RL) has enabled learning a collection of behaviorally diverse, high performing policies. However, these methods typically involve storing thousands of policies, which results in high space-complexity and poor scaling to additional behaviors. Condensing the archive into a single model while retaining the performance and coverage of the original collection of policies has proved challenging. In this work, we propose using diffusion models to distill the archive into a single generative model over policy parameters. We show that our method achieves a compression ratio of 13x while recovering 98% of the original rewards and 89% of the original coverage. Further, the conditioning mechanism of diffusion models allows for flexibly selecting and sequencing behaviors, including using language. Project website: https://sites.google.com/view/policydiffusion/home

IndustReal: Transferring Contact-Rich Assembly Tasks from Simulation to Reality

May 26, 2023

Bingjie Tang, Michael A. Lin, Iretiayo Akinola, Ankur Handa, Gaurav S. Sukhatme, Fabio Ramos, Dieter Fox, Yashraj Narang

Abstract:Robotic assembly is a longstanding challenge, requiring contact-rich interaction and high precision and accuracy. Many applications also require adaptivity to diverse parts, poses, and environments, as well as low cycle times. In other areas of robotics, simulation is a powerful tool to develop algorithms, generate datasets, and train agents. However, simulation has had a more limited impact on assembly. We present IndustReal, a set of algorithms, systems, and tools that solve assembly tasks in simulation with reinforcement learning (RL) and successfully achieve policy transfer to the real world. Specifically, we propose 1) simulation-aware policy updates, 2) signed-distance-field rewards, and 3) sampling-based curricula for robotic RL agents. We use these algorithms to enable robots to solve contact-rich pick, place, and insertion tasks in simulation. We then propose 4) a policy-level action integrator to minimize error at policy deployment time. We build and demonstrate a real-world robotic assembly system that uses the trained policies and action integrator to achieve repeatable performance in the real world. Finally, we present hardware and software tools that allow other researchers to reproduce our system and results. For videos and additional details, please see http://sites.google.com/nvidia.com/industreal .

* Accepted to Robotics: Science and Systems (RSS) 2023

Reducing Network Load via Message Utility Estimation for Decentralized Multirobot Teams

Apr 14, 2023

Isabel M. Rayas Fernández, Christopher E. Denniston, Gaurav S. Sukhatme

Abstract:We are motivated by quantile estimation of algae concentration in lakes. We find that multirobot teams improve performance in this task over single robots, and communication-enabled teams further over communication-deprived teams; however, real robots are resource-constrained, and communication networks cannot support arbitrary message loads, making both naïve, constant information-sharing and complex modeling and decision-making infeasible. With this in mind, we propose online, locally computable metrics for determining the utility of transmitting a given message to the other team members and a decision-theoretic approach that chooses to transmit only the most useful messages, using a decentralized and independent framework for maintaining beliefs of other teammates. We validate our approach in simulation on a real-world aquatic dataset, and show that restricting communication via a utility estimation method based on the expected impact of a message on future teammate behavior results in a 44% decrease in network load while increasing quantile estimation error by only 2.16%.

* 4 pages, 1 table, 3 figures

Learned Parameter Selection for Robotic Information Gathering

Mar 09, 2023

Christopher E. Denniston, Gautam Salhotra, Akseli Kangaslahti, David A. Caron, Gaurav S. Sukhatme

Abstract:When robots are deployed in the field for environmental monitoring they typically execute pre-programmed motions, such as lawnmower paths, instead of adaptive methods, such as informative path planning. One reason for this is that adaptive methods are dependent on parameter choices that are both critical to set correctly and difficult for the non-specialist to choose. Here, we show how to automatically configure a planner for informative path planning by training a reinforcement learning agent to select planner parameters at each iteration of informative path planning. We demonstrate our method with 37 instances of 3 distinct environments, and compare it against pure (end-to-end) reinforcement learning techniques, as well as approaches that do not use a learned model to change the planner parameters. Our method shows a 9.53% mean improvement in the cumulative reward across diverse environments when compared to end-to-end learning based methods; we also demonstrate via a field experiment how it can be readily used to facilitate high performance deployment of an information gathering robot.

* 8 pages, Submitted to IROS 2023

A Study on Multirobot Quantile Estimation in Natural Environments

Mar 06, 2023

Isabel M. Rayas Fernández, Christopher E. Denniston, Gaurav S. Sukhatme

Abstract:Quantiles of a natural phenomenon can provide scientists with an important understanding of typical, extreme, or other spreads of concentrations. When a group has several available robots, or teams of scientists come together to study a particular environment, it may be advantageous to pool robot resources in a collaborative way to improve performance. A multirobot team can be difficult to practically bring together and coordinate, especially when robot communication is involved. To this end, we present a study across several axes of the impact of using multiple robots to estimate quantiles of a distribution of interest using an informative path planning formulation. We measure quantile estimation accuracy with increasing team size to understand what benefits result from a multirobot approach in a drone exploration task of analyzing the algae concentration in lakes. We additionally perform an analysis on several parameters, including the spread of robot initial positions, the planning budget, and inter-robot communication, and find that while using more robots generally results in lower estimation error, this benefit is achieved under certain conditions. We present our findings in the context of real field robotic applications and discuss the implications of the results and interesting directions for future work.

* 8 pages, 9 figures

Probabilistic Trajectory Planning for Static and Interaction-aware Dynamic Obstacle Avoidance

Feb 24, 2023

Baskın Şenbaşlar, Gaurav S. Sukhatme

Abstract:Collision-free mobile robot navigation is an important problem for many robotics applications, especially in cluttered environments. In such environments, obstacles can be static or dynamic. Dynamic obstacles can additionally be interactive, i.e. changing their behavior according to the behavior of other entities. The perception and prediction modules of robotic systems create probabilistic representations and predictions of such environments. In this paper, we propose a novel prediction representation for interactive behaviors of dynamic obstacles. Then, we propose a real-time trajectory planning algorithm that probabilistically avoids collisions against static and interactive dynamic obstacles, and produces dynamically feasible trajectories. During decision making, our planner simulates the interactive behavior of dynamic obstacles in response to the actions the planning robot takes. We explicitly minimize collision probabilities against static and dynamic obstacles using a multi-objective search formulation. Then, we formulate a quadratic program to safely fit a smooth trajectory to the search result while attempting to preserve the collision probabilities computed during search. We evaluate our algorithm extensively in simulations to show its performance under different environments and configurations using 78000 randomly generated cases. We compare its performance to a state-of-the-art trajectory planning algorithm for static and dynamic obstacle avoidance using 4500 randomly generated cases. We show that our algorithm achieves up to 3.8x the success rate of the baseline while using as little as 0.18x of the baseline's time. We implement our algorithm for physical quadrotors, and show its feasibility in the real world.

* 22 pages

RREx-BoT: Remote Referring Expressions with a Bag of Tricks

Jan 30, 2023

Gunnar A. Sigurdsson, Jesse Thomason, Gaurav S. Sukhatme, Robinson Piramuthu

Abstract:Household robots operate in the same space for years. Such robots incrementally build dynamic maps that can be used for tasks requiring remote object localization. However, benchmarks in robot learning often test generalization through inference on tasks in unobserved environments. In an observed environment, locating an object is reduced to choosing from among all object proposals in the environment, which may number in the 100,000s. Armed with this intuition, using only a generic vision-language scoring model with minor modifications for 3D encoding and operating in an embodied environment, we demonstrate an absolute performance gain of 9.84% on remote object grounding above state-of-the-art models for REVERIE and of 5.04% on FAO. When allowed to pre-explore an environment, we also exceed the previous state-of-the-art pre-exploration method on REVERIE. Additionally, we demonstrate our model on a real-world TurtleBot platform, highlighting the simplicity and usefulness of the approach. Our analysis outlines a "bag of tricks" essential for accomplishing this task, from utilizing 3D coordinates and context, to generalizing vision-language models to large 3D search spaces.

Fast and Scalable Signal Inference for Active Robotic Source Seeking

Jan 06, 2023

Christopher E. Denniston, Oriana Peltzer, Joshua Ott, Sangwoo Moon, Sung-Kyun Kim, Gaurav S. Sukhatme, Mykel J. Kochenderfer, Mac Schwager, Ali-akbar Agha-mohammadi

Abstract:In active source seeking, a robot takes repeated measurements in order to locate a signal source in a cluttered and unknown environment. A key component of an active source seeking robot planner is a model that can produce estimates of the signal at unknown locations with uncertainty quantification. This model allows the robot to plan for future measurements in the environment. Traditionally, this model has been in the form of a Gaussian process, which has difficulty scaling and cannot represent obstacles. We propose a global and local factor graph model for active source seeking, which allows the model to scale to a large number of measurements and represent unknown obstacles in the environment. We combine this model with extensions to a highly scalable planner to form a system for large-scale active source seeking. We demonstrate that our approach outperforms baseline methods in both simulated and real robot experiments.

* 6 pages, Submitted to ICRA 2023

OpenD: A Benchmark for Language-Driven Door and Drawer Opening

Dec 10, 2022

Yizhou Zhao, Qiaozi Gao, Liang Qiu, Govind Thattai, Gaurav S. Sukhatme

Abstract:We introduce OPEND, a benchmark for learning how to use a hand to open cabinet doors or drawers in a photo-realistic and physics-reliable simulation environment driven by language instruction. To solve the task, we propose a multi-step planner composed of a deep neural network and rule-based controllers. The network is utilized to capture spatial relationships from images and understand semantic meaning from language instructions. Controllers efficiently execute the plan based on the spatial and semantic understanding. We evaluate our system by measuring its zero-shot performance on a test data set. Experimental results demonstrate the effectiveness of decision planning by our multi-step planner for different hands, while suggesting that there is significant room for developing better models to address the challenge brought by language understanding, spatial reasoning, and long-term manipulation. We will release OPEND and host challenges to promote future research in this area.

CLIP-Nav: Using CLIP for Zero-Shot Vision-and-Language Navigation

Nov 30, 2022

Vishnu Sashank Dorbala, Gunnar Sigurdsson, Robinson Piramuthu, Jesse Thomason, Gaurav S. Sukhatme

Abstract:Household environments are visually diverse. Embodied agents performing Vision-and-Language Navigation (VLN) in the wild must be able to handle this diversity, while also following arbitrary language instructions. Recently, Vision-Language models like CLIP have shown great performance on the task of zero-shot object recognition. In this work, we ask if these models are also capable of zero-shot language grounding. In particular, we utilize CLIP to tackle the novel problem of zero-shot VLN using natural language referring expressions that describe target objects, in contrast to past work that used simple language templates describing object classes. We examine CLIP's capability in making sequential navigational decisions without any dataset-specific finetuning, and study how it influences the path that an agent takes. Our results on the coarse-grained instruction following task of REVERIE demonstrate the navigational capability of CLIP, surpassing the supervised baseline in terms of both success rate (SR) and success weighted by path length (SPL). More importantly, we quantitatively show that our CLIP-based zero-shot approach generalizes better to show consistent performance across environments when compared to SOTA, fully supervised learning approaches when evaluated via Relative Change in Success (RCS).

* 8 pages, Accepted at LangRob Workshop at Conference on Robot Learning (CoRL), 2022
