Gaurav S. Sukhatme

Efficiently Learning Small Policies for Locomotion and Manipulation

Sep 30, 2022

Shashank Hegde, Gaurav S. Sukhatme

Abstract: Neural control of memory-constrained, agile robots requires small, yet highly performant models. We leverage graph hyper networks to learn graph hyper policies trained with off-policy reinforcement learning resulting in networks that are two orders of magnitude smaller than commonly used networks yet encode policies comparable to those encoded by much larger networks trained on the same task. We show that our method can be appended to any off-policy reinforcement learning algorithm, without any change in hyperparameters, by showing results across locomotion and manipulation tasks. Further, we obtain an array of working policies, with differing numbers of parameters, allowing us to pick an optimal network for the memory constraints of a system. Training multiple policies with our method is as sample efficient as training a single policy. Finally, we provide a method to select the best architecture, given a constraint on the number of parameters. Project website: https://sites.google.com/usc.edu/graphhyperpolicy
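
The idea of picking a policy network under a parameter budget can be illustrated with a minimal sketch. This is not the authors' selection method, which the abstract does not describe; it assumes plain fully connected policies, and the function names, the candidate architectures, and the return values are all hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

Arch = Tuple[int, ...]  # hidden layer widths, e.g. (4, 4)


def mlp_param_count(obs_dim: int, act_dim: int, hidden: Sequence[int]) -> int:
    """Count weights and biases of a fully connected network."""
    sizes = [obs_dim, *hidden, act_dim]
    return sum(i * o + o for i, o in zip(sizes[:-1], sizes[1:]))


def pick_architecture(
    returns: Dict[Arch, float], obs_dim: int, act_dim: int, max_params: int
) -> Optional[Arch]:
    """Return the best-scoring architecture that fits the budget, or None."""
    feasible = {
        arch: score
        for arch, score in returns.items()
        if mlp_param_count(obs_dim, act_dim, arch) <= max_params
    }
    if not feasible:
        return None
    return max(feasible, key=feasible.get)


# Hypothetical evaluation returns, made up for illustration only.
example_returns = {(8,): 100.0, (4, 4): 120.0, (16, 16): 300.0}
best = pick_architecture(example_returns, obs_dim=4, act_dim=2, max_params=60)
```

With a budget of 60 parameters the largest candidate is excluded, so the choice is made only among the two small networks.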

CH-MARL: A Multimodal Benchmark for Cooperative, Heterogeneous Multi-Agent Reinforcement Learning

Aug 26, 2022

Vasu Sharma, Prasoon Goyal, Kaixiang Lin, Govind Thattai, Qiaozi Gao, Gaurav S. Sukhatme

Abstract: We propose a multimodal (vision-and-language) benchmark for cooperative and heterogeneous multi-agent learning. We introduce a benchmark multimodal dataset with tasks involving collaboration between multiple simulated heterogeneous robots in a rich multi-room home environment. We provide an integrated learning framework, multimodal implementations of state-of-the-art multi-agent reinforcement learning techniques, and a consistent evaluation protocol. Our experiments investigate the impact of different modalities on multi-agent learning performance. We also introduce a simple message passing method between agents. The results suggest that multimodality introduces unique challenges for cooperative multi-agent learning and there is significant room for advancing multi-agent reinforcement learning methods in such settings.

Decentralized Risk-Aware Tracking of Multiple Targets

Aug 04, 2022

Jiazhen Liu, Lifeng Zhou, Ragesh Ramachandran, Gaurav S. Sukhatme, Vijay Kumar

Abstract: We consider the setting where a team of robots is tasked with tracking multiple targets with the following property: approaching the targets enables more accurate target position estimation, but also increases the risk of sensor failures. Therefore, it is essential to address the trade-off between tracking quality maximization and risk minimization. In our previous work, a centralized controller is developed to plan motions for all the robots -- however, this is not a scalable approach. Here, we present a decentralized and risk-aware multi-target tracking framework, in which each robot plans its motion trading off tracking accuracy maximization and aversion to risk, while only relying on its own information and information exchanged with its neighbors. We use the control barrier function to guarantee network connectivity throughout the tracking process. Extensive numerical experiments demonstrate that our system can achieve similar tracking accuracy and risk-awareness to its centralized counterpart.

* DARS2022 submission preprint

Learning Deformable Object Manipulation from Expert Demonstrations

Jul 20, 2022

Gautam Salhotra, I-Chun Arthur Liu, Marcus Dominguez-Kuhne, Gaurav S. Sukhatme

Abstract: We present a novel Learning from Demonstration (LfD) method, Deformable Manipulation from Demonstrations (DMfD), to solve deformable manipulation tasks using states or images as inputs, given expert demonstrations. Our method uses demonstrations in three different ways, and balances the trade-off between exploring the environment online and using guidance from experts to explore high dimensional spaces effectively. We test DMfD on a set of representative manipulation tasks for a 1-dimensional rope and a 2-dimensional cloth from the SoftGym suite of tasks, each with state and image observations. Our method exceeds baseline performance by up to 12.9% for state-based tasks and up to 33.44% on image-based tasks, with comparable or better robustness to randomness. Additionally, we create two challenging environments for folding a 2D cloth using image-based observations, and set a performance benchmark for them. We deploy DMfD on a real robot with a minimal loss in normalized performance during real-world execution compared to simulation (~6%). Source code is on github.com/uscresl/dmfd

* IEEE Robotics & Automation Letters (RA-L) Oct 2022
* Accepted to IEEE Robotics & Automation Letters (RA-L) and IEEE IROS 2022. Project website: https://uscresl.github.io/dmfd

A Simple Approach for Visual Rearrangement: 3D Mapping and Semantic Search

Jun 21, 2022

Brandon Trabucco, Gunnar Sigurdsson, Robinson Piramuthu, Gaurav S. Sukhatme, Ruslan Salakhutdinov

Abstract: Physically rearranging objects is an important capability for embodied agents. Visual room rearrangement evaluates an agent's ability to rearrange objects in a room to a desired goal based solely on visual input. We propose a simple yet effective method for this problem: (1) search for and map which objects need to be rearranged, and (2) rearrange each object until the task is complete. Our approach consists of an off-the-shelf semantic segmentation model, voxel-based semantic map, and semantic search policy to efficiently find objects that need to be rearranged. On the AI2-THOR Rearrangement Challenge, our method improves on current state-of-the-art end-to-end reinforcement learning-based methods that learn visual rearrangement policies from 0.53% correct rearrangement to 16.56%, using only 2.7% as many samples from the environment.

* Winner of the Rearrangement Challenge at CVPR 2022

Loop Closure Prioritization for Efficient and Scalable Multi-Robot SLAM

May 24, 2022

Christopher E. Denniston, Yun Chang, Andrzej Reinke, Kamak Ebadi, Gaurav S. Sukhatme, Luca Carlone, Benjamin Morrell, Ali-akbar Agha-mohammadi

Abstract: Multi-robot SLAM systems in GPS-denied environments require loop closures to maintain a drift-free centralized map. With an increasing number of robots and size of the environment, checking and computing the transformation for all the loop closure candidates becomes computationally infeasible. In this work, we describe a loop closure module that is able to prioritize which loop closures to compute based on the underlying pose graph, the proximity to known beacons, and the characteristics of the point clouds. We validate this system in the context of the DARPA Subterranean Challenge and on numerous challenging underground datasets and demonstrate the ability of this system to generate and maintain a map with low error. We find that our proposed techniques are able to select effective loop closures, which results in a 51% mean reduction in median error when compared to an odometric solution and a 75% mean reduction in median error when compared to a baseline version of this system with no prioritization. We also find our proposed system is able to find a lower error in the mission time of one hour when compared to a system that processes every possible loop closure in four and a half hours.

Inferring Articulated Rigid Body Dynamics from RGBD Video

Mar 20, 2022

Eric Heiden, Ziang Liu, Vibhav Vineet, Erwin Coumans, Gaurav S. Sukhatme

Abstract: Being able to reproduce physical phenomena ranging from light interaction to contact mechanics, simulators are becoming increasingly useful in more and more application domains where real-world interaction or labeled data are difficult to obtain. Despite recent progress, significant human effort is needed to configure simulators to accurately reproduce real-world behavior. We introduce a pipeline that combines inverse rendering with differentiable simulation to create digital twins of real-world articulated mechanisms from depth or RGB videos. Our approach automatically discovers joint types and estimates their kinematic parameters, while the dynamic properties of the overall mechanism are tuned to attain physically accurate simulations. Control policies optimized in our derived simulation transfer successfully back to the original system, as we demonstrate on a simulated system. Further, our approach accurately reconstructs the kinematic tree of an articulated mechanism being manipulated by a robot, and highly nonlinear dynamics of a real-world coupled pendulum mechanism. Website: https://eric-heiden.github.io/video2sim

* Submitted to IROS 2022

DialFRED: Dialogue-Enabled Agents for Embodied Instruction Following

Feb 27, 2022

Xiaofeng Gao, Qiaozi Gao, Ran Gong, Kaixiang Lin, Govind Thattai, Gaurav S. Sukhatme

Abstract: Language-guided Embodied AI benchmarks requiring an agent to navigate an environment and manipulate objects typically allow one-way communication: the human user gives a natural language command to the agent, and the agent can only follow the command passively. We present DialFRED, a dialogue-enabled embodied instruction following benchmark based on the ALFRED benchmark. DialFRED allows an agent to actively ask questions to the human user; the additional information in the user's response is used by the agent to better complete its task. We release a human-annotated dataset with 53K task-relevant questions and answers and an oracle to answer questions. To solve DialFRED, we propose a questioner-performer framework wherein the questioner is pre-trained with the human-annotated data and fine-tuned with reinforcement learning. We make DialFRED publicly available and encourage researchers to propose and evaluate their solutions to building dialog-enabled embodied agents.

* 8 pages, 5 figures, under review

Privacy Preserving Visual Question Answering

Feb 15, 2022

Cristian-Paul Bara, Qing Ping, Abhinav Mathur, Govind Thattai, Rohith MV, Gaurav S. Sukhatme

Abstract: We introduce a novel privacy-preserving methodology for performing Visual Question Answering on the edge. Our method constructs a symbolic representation of the visual scene, using a low-complexity computer vision model that jointly predicts classes, attributes and predicates. This symbolic representation is non-differentiable, which means it cannot be used to recover the original image, thereby keeping the original image private. Our proposed hybrid solution uses a vision model which is more than 25 times smaller than the current state-of-the-art (SOTA) vision models, and 100 times smaller than end-to-end SOTA VQA models. We report detailed error analysis and discuss the trade-offs of using a distilled vision model and a symbolic representation of the visual scene.

Adaptive Sampling to Estimate Quantiles for Guiding Physical Sampling

Jan 25, 2022

Isabel M. Rayas Fernández, Christopher E. Denniston, David A. Caron, Gaurav S. Sukhatme

Abstract: Scientists interested in studying natural phenomena often take physical samples for later analysis at locations specified by expert heuristics. Instead, we propose to guide scientists' physical sampling by using a robot to perform an adaptive sampling survey to find locations to suggest that correspond to the quantile values of pre-specified quantiles of interest. We develop a robot planner using novel objective functions to improve the estimates of the quantile values over time and an approach to find locations which correspond to the quantile values. We demonstrate our approach on two different sampling tasks in simulation using previously collected aquatic data and validate it in a field trial. Our approach outperforms objectives that maximize spatial coverage or find extrema in planning and is able to localize the quantile spatial locations.
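
The notion of a location that "corresponds to a quantile value" can be made concrete with a small sketch. It covers only the last step, on samples already collected, and is not the paper's planner or objective functions; the linear-interpolation quantile, the nearest-value matching rule, the function names, and the sample data are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Location = Tuple[float, float]
Sample = Tuple[Location, float]  # (where it was measured, measured value)


def quantile_value(values: Sequence[float], q: float) -> float:
    """Quantile by linear interpolation between sorted values, q in [0, 1]."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    frac = pos - lo
    return s[lo] * (1 - frac) + s[hi] * frac


def locations_for_quantiles(
    samples: List[Sample], quantiles: Sequence[float]
) -> Dict[float, Location]:
    """For each quantile, suggest the sampled location whose value is closest."""
    values = [v for _, v in samples]
    suggestions = {}
    for q in quantiles:
        target = quantile_value(values, q)
        loc, _ = min(samples, key=lambda s: abs(s[1] - target))
        suggestions[q] = loc
    return suggestions


# Hypothetical measurements on a small grid, made up for illustration only.
example_samples = [
    ((0.0, 0.0), 1.0),
    ((0.0, 1.0), 2.0),
    ((1.0, 0.0), 3.0),
    ((1.0, 1.0), 4.0),
    ((2.0, 0.0), 5.0),
]
suggested = locations_for_quantiles(example_samples, [0.5, 0.9])
```

Here the median of the five values is 3.0, so the suggested location for the 0.5 quantile is the point where 3.0 was measured.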

* 7 pages, 8 figures
