George Konidaris

MIT

Automatic Encoding and Repair of Reactive High-Level Tasks with Learned Abstract Representations

Apr 18, 2022

Adam Pacheck, Steven James, George Konidaris, Hadas Kress-Gazit

Abstract: We present a framework that, given a set of skills a robot can perform, abstracts sensor data into symbols that we use to automatically encode the robot's capabilities in Linear Temporal Logic. We specify reactive high-level tasks based on these capabilities, for which a strategy is automatically synthesized and executed on the robot, if the task is feasible. If a task is not feasible given the robot's capabilities, we present two methods, one enumeration-based and one synthesis-based, for automatically suggesting additional skills for the robot or modifications to existing skills that would make the task feasible. We demonstrate our framework on a Baxter robot manipulating blocks on a table, a Baxter robot manipulating plates on a table, and a Kinova arm manipulating vials, with multiple sensor modalities, including raw images.

* 27 pages, 15 figures, Submitted to The International Journal of Robotics Research (IJRR)

Hierarchical Reinforcement Learning of Locomotion Policies in Response to Approaching Objects: A Preliminary Study

Mar 20, 2022

Shangqun Yu, Sreehari Rammohan, Kaiyu Zheng, George Konidaris

Abstract: Animals such as rabbits and birds can instantly generate locomotion behavior in reaction to a dynamic, approaching object, such as a person or a rock, despite having possibly never seen the object before and having limited perception of the object's properties. Recently, deep reinforcement learning has enabled complex kinematic systems such as humanoid robots to successfully move from point A to point B. Inspired by the observation of the innate reactive behavior of animals in nature, we hope to extend this progress in robot locomotion to settings where external, dynamic objects are involved whose properties are partially observable to the robot. As a first step toward this goal, we build a simulation environment in MuJoCo where a legged robot must avoid getting hit by a ball moving toward it. We explore whether prior locomotion experiences that animals typically possess benefit the learning of a reactive control policy under a proposed hierarchical reinforcement learning framework. Preliminary results support the claim that the learning becomes more efficient using this hierarchical reinforcement learning method, even when partial observability (radius-based object visibility) is taken into account.

* RLDM 2022

IKFlow: Generating Diverse Inverse Kinematics Solutions

Nov 17, 2021

Barrett Ames, Jeremy Morgan, George Konidaris

Abstract: Inverse kinematics - finding joint poses that reach a given Cartesian-space end-effector pose - is a common operation in robotics, since goals and waypoints are typically defined in Cartesian space, but robots must be controlled in joint space. However, existing inverse kinematics solvers return a single solution pose, whereas systems with more than 6 degrees of freedom support infinitely many such solutions, which can be useful in the presence of constraints, pose preferences, or obstacles. We introduce a method that uses a deep neural network to learn to generate a diverse set of samples from the solution space of such kinematic chains. The resulting samples can be generated quickly (2000 solutions in under 10ms) and accurately (to within 10 millimeters and 2 degrees of an exact solution) and can be rapidly refined by classical methods if necessary.

* 7 pages, 4 figures
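
To make the point about redundant chains concrete, here is a minimal sketch that is not IKFlow and uses no learning. It assumes a hypothetical planar 3-link arm reaching a 2D target position, so the arm has one more joint than the task needs. Sweeping the first joint angle and solving the remaining two links analytically yields a whole family of joint configurations for one target. The function names and link lengths are illustrative assumptions.

```python
import math


def forward_kinematics(thetas, lengths):
    """End-effector (x, y) of a planar chain with relative joint angles."""
    x = y = angle = 0.0
    for theta, length in zip(thetas, lengths):
        angle += theta
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y


def two_link_ik(px, py, l1, l2):
    """Both analytic solutions (the two elbow branches) of a planar 2-link arm.

    Returns an empty list when the point is out of reach.
    """
    cos_t2 = (px * px + py * py - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_t2) > 1.0:
        return []
    solutions = []
    for sign in (1.0, -1.0):
        t2 = sign * math.acos(cos_t2)
        t1 = math.atan2(py, px) - math.atan2(l2 * math.sin(t2), l1 + l2 * math.cos(t2))
        solutions.append((t1, t2))
    return solutions


def redundant_ik(target, lengths, n_samples=50):
    """Sample the free first joint, then solve the last two links analytically."""
    l0, l1, l2 = lengths
    tx, ty = target
    solutions = []
    for i in range(n_samples):
        t0 = -math.pi + 2.0 * math.pi * i / n_samples
        # Vector from the end of link 0 to the target, in the world frame.
        dx = tx - l0 * math.cos(t0)
        dy = ty - l0 * math.sin(t0)
        # Express that vector in the frame rotated by t0 (the frame of link 0).
        rx = math.cos(t0) * dx + math.sin(t0) * dy
        ry = -math.sin(t0) * dx + math.cos(t0) * dy
        for t1, t2 in two_link_ik(rx, ry, l1, l2):
            solutions.append((t0, t1, t2))
    return solutions


if __name__ == "__main__":
    sols = redundant_ik((1.0, 0.5), (1.0, 1.0, 1.0))
    print(len(sols), "distinct joint configurations reach the same target")
```

A grid sweep like this only works for toy chains; per the abstract, the paper's contribution is a learned sampler for such solution sets on real manipulators.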

Guided Policy Search for Parameterized Skills using Adverbs

Oct 23, 2021

Benjamin A. Spiegel, George Konidaris

Abstract: We present a method for using adverb phrases to adjust skill parameters via learned adverb-skill groundings. These groundings allow an agent to use adverb feedback provided by a human to directly update a skill policy, in a manner similar to traditional local policy search methods. We show that our method can be used as a drop-in replacement for these policy search methods when dense reward from the environment is not available but human language feedback is. We demonstrate improved sample efficiency over modern policy search methods in two experiments.

* 7 pages, 3 figures

Coarse-Grained Smoothness for RL in Metric Spaces

Oct 23, 2021

Omer Gottesman, Kavosh Asadi, Cameron Allen, Sam Lobel, George Konidaris, Michael Littman

Abstract: Principled decision-making in continuous state-action spaces is impossible without some assumptions. A common approach is to assume Lipschitz continuity of the Q-function. We show that, unfortunately, this property fails to hold in many typical domains. We propose a new coarse-grained smoothness definition that generalizes the notion of Lipschitz continuity, is more widely applicable, and allows us to compute significantly tighter bounds on Q-functions, leading to improved learning. We provide a theoretical analysis of our new smoothness definition, and discuss its implications and impact on control and exploration in continuous domains.
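
For context, the standard Lipschitz assumption mentioned in the abstract can be sketched as follows. This illustrates only the classical bound, not the paper's coarse-grained definition. It assumes a hypothetical 1-D state-action space with metric |a - b| and a known Lipschitz constant; the function name and the sample values are made up for the example.

```python
def lipschitz_q_bounds(query, samples, lipschitz_const):
    """Bound Q(query) from known (point, q_value) samples.

    If |Q(a) - Q(b)| <= L * |a - b| for all a, b, then every sample
    (x, q) implies q - L * |query - x| <= Q(query) <= q + L * |query - x|.
    The tightest bounds combine all samples.
    """
    if not samples:
        raise ValueError("need at least one sample")
    upper = min(q + lipschitz_const * abs(query - x) for x, q in samples)
    lower = max(q - lipschitz_const * abs(query - x) for x, q in samples)
    return lower, upper


if __name__ == "__main__":
    known = [(0.0, 1.0), (2.0, 2.0)]  # hypothetical (state-action, Q) pairs
    print(lipschitz_q_bounds(1.0, known, lipschitz_const=1.0))
```

The bounds widen linearly with the Lipschitz constant, so a domain whose Q-function has even one sharp jump forces a large constant and loose bounds everywhere.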

Towards Optimal Correlational Object Search

Oct 19, 2021

Kaiyu Zheng, Rohan Chitnis, Yoonchang Sung, George Konidaris, Stefanie Tellex

Abstract: In realistic applications of object search, robots will need to locate target objects in complex environments while coping with unreliable sensors, especially for small or hard-to-detect objects. In such settings, correlational information can be valuable for planning efficiently: when looking for a fork, the robot could start by locating the easier-to-detect refrigerator, since forks would probably be found nearby. Previous approaches to object search with correlational information typically resort to ad-hoc or greedy search strategies. In this paper, we propose the Correlational Object Search POMDP (COS-POMDP), which can be solved to produce search strategies that use correlational information. COS-POMDPs contain a correlation-based observation model that allows us to avoid the exponential blow-up of maintaining a joint belief about all objects, while preserving the optimal solution to this naive, exponential POMDP formulation. We propose a hierarchical planning algorithm to scale up COS-POMDP for practical domains. We conduct experiments using AI2-THOR, a realistic simulator of household environments, as well as YOLOv5, a widely used object detector. Our results show that, particularly for hard-to-detect objects, such as scrub brush and remote control, our method offers the most robust performance compared to baselines that ignore correlations as well as a greedy, next-best view approach.

* 10 pages, 4 figures, 3 tables
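
The fork-and-refrigerator intuition can be illustrated with a generic Bayes update. This is a minimal sketch, not the COS-POMDP observation model. The room names, the uniform prior, and the likelihood values are all hypothetical numbers chosen for the example.

```python
def update_belief(belief, likelihood):
    """Bayes rule: posterior(loc) is proportional to prior(loc) * P(obs | target at loc)."""
    unnormalized = {loc: belief[loc] * likelihood[loc] for loc in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return {loc: p / total for loc, p in unnormalized.items()}


# Hypothetical prior over where the fork (the target) is.
prior = {"kitchen": 1 / 3, "living_room": 1 / 3, "bedroom": 1 / 3}

# Hypothetical correlation model: probability of detecting the refrigerator
# in the kitchen, conditioned on each possible fork location.
fridge_seen_in_kitchen = {"kitchen": 0.8, "living_room": 0.1, "bedroom": 0.1}

posterior = update_belief(prior, fridge_seen_in_kitchen)

if __name__ == "__main__":
    print(posterior)
```

Here the belief is kept over the target's location only, and the correlated object enters through the likelihood, which is the flavor of factorization the abstract credits with avoiding a joint belief over all objects.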

Learning to Infer Kinematic Hierarchies for Novel Object Instances

Oct 15, 2021

Hameed Abdul-Rashid, Miles Freeman, Ben Abbatematteo, George Konidaris, Daniel Ritchie

Abstract: Manipulating an articulated object requires perceiving its kinematic hierarchy: its parts, how each can move, and how those motions are coupled. Previous work has explored perception for kinematics, but none infers a complete kinematic hierarchy on never-before-seen object instances, without relying on a schema or template. We present a novel perception system that achieves this goal. Our system infers the moving parts of an object and the kinematic couplings that relate them. To infer parts, it uses a point cloud instance segmentation neural network and to infer kinematic hierarchies, it uses a graph neural network to predict the existence, direction, and type of edges (i.e. joints) that relate the inferred parts. We train these networks using simulated scans of synthetic 3D models. We evaluate our system on simulated scans of 3D objects, and we demonstrate a proof-of-concept use of our system to drive real-world robotic manipulation.

Generalizing to New Domains by Mapping Natural Language to Lifted LTL

Oct 11, 2021

Eric Hsiung, Hiloni Mehta, Junchi Chu, Xinyu Liu, Roma Patel, Stefanie Tellex, George Konidaris

Abstract: Recent work on using natural language to specify commands to robots has grounded that language to LTL. However, mapping natural language task specifications to LTL task specifications using language models requires probability distributions over a finite vocabulary. Existing state-of-the-art methods have extended this finite vocabulary to include unseen terms from the input sequence to improve output generalization. However, novel out-of-vocabulary atomic propositions cannot be generated using these methods. To overcome this, we introduce an intermediate contextual query representation which can be learned from single positive task specification examples, associating a contextual query with an LTL template. We demonstrate that this intermediate representation allows for generalization over unseen object references, assuming accurate groundings are available. We compare our method of mapping natural language task specifications to intermediate contextual queries against state-of-the-art CopyNet models capable of translating natural language to LTL, by evaluating whether correct LTL for manipulation and navigation task specifications can be output, and show that our method outperforms the CopyNet model on unseen object references. We demonstrate that the grounded LTL our method outputs can be used for planning in a simulated OO-MDP environment. Finally, we discuss some common failure modes encountered when translating natural language task specifications to grounded LTL.

* 7 pages (6 + 1 references page), 3 figures, 2 tables. Submitted to ICRA 2022

RMPs for Safe Impedance Control in Contact-Rich Manipulation

Sep 24, 2021

Seiji Shaw, Ben Abbatematteo, George Konidaris

Abstract: Variable impedance control in operation-space is a promising approach to learning contact-rich manipulation behaviors. One of the main challenges with this approach is producing a manipulation behavior that ensures the safety of the arm and the environment. Such behavior is typically implemented via a reward function that penalizes unsafe actions (e.g. obstacle collision, joint limit extension), but that approach is not always effective and does not result in behaviors that can be reused in slightly different environments. We show how to combine Riemannian Motion Policies, a class of policies that dynamically generate motion in the presence of safety and collision constraints, with variable impedance operation-space control to learn safer contact-rich manipulation behaviors.

HAC Explore: Accelerating Exploration with Hierarchical Reinforcement Learning

Aug 12, 2021

Willie McClinton, Andrew Levy, George Konidaris

Abstract: Sparse rewards and long time horizons remain challenging for reinforcement learning algorithms. Exploration bonuses can help in sparse reward settings by encouraging agents to explore the state space, while hierarchical approaches can assist with long-horizon tasks by decomposing lengthy tasks into shorter subtasks. We propose HAC Explore (HACx), a new method that combines these approaches by integrating the exploration bonus method Random Network Distillation (RND) into the hierarchical approach Hierarchical Actor-Critic (HAC). HACx outperforms either component method on its own, as well as an existing approach to combining hierarchy and exploration, in a set of difficult simulated robotics tasks. HACx is the first RL method to solve a sparse reward, continuous-control task that requires over 1,000 actions.
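
The exploration-bonus idea behind RND can be sketched in a few lines. This is a deliberately simplified linear stand-in, not HACx and not the neural-network RND used in practice: a fixed random "target" map is imitated by a trainable "predictor", and the prediction error serves as the novelty bonus. The class name, dimensions, and learning rate are assumptions made for the example.

```python
import random


class LinearRND:
    """Toy Random Network Distillation with linear maps instead of networks."""

    def __init__(self, state_dim, feature_dim=4, learning_rate=0.05, seed=0):
        rng = random.Random(seed)
        # Fixed, randomly initialized target map (never trained).
        self.target = [[rng.gauss(0.0, 1.0) for _ in range(state_dim)]
                       for _ in range(feature_dim)]
        # Predictor starts at zero and is trained to match the target.
        self.predictor = [[0.0] * state_dim for _ in range(feature_dim)]
        self.learning_rate = learning_rate

    @staticmethod
    def _apply(weights, state):
        return [sum(w * s for w, s in zip(row, state)) for row in weights]

    def bonus(self, state):
        """Squared prediction error: large for states unlike those trained on."""
        target_out = self._apply(self.target, state)
        pred_out = self._apply(self.predictor, state)
        return sum((t - p) ** 2 for t, p in zip(target_out, pred_out))

    def update(self, state):
        """One gradient step on 0.5 * squared error for a visited state."""
        target_out = self._apply(self.target, state)
        pred_out = self._apply(self.predictor, state)
        for row, t, p in zip(self.predictor, target_out, pred_out):
            error = p - t
            for j, s in enumerate(state):
                row[j] -= self.learning_rate * error * s


if __name__ == "__main__":
    rnd = LinearRND(state_dim=2)
    visited, novel = [1.0, 0.5], [-0.5, 1.0]
    for _ in range(100):
        rnd.update(visited)
    print("visited bonus:", rnd.bonus(visited))
    print("novel bonus:  ", rnd.bonus(novel))
```

After repeated updates on one state its bonus shrinks toward zero, while a state in an unvisited direction keeps a large bonus, which is what steers the agent toward unexplored regions.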
