Siddharth Karamcheti

Eliciting Compatible Demonstrations for Multi-Human Imitation Learning

Oct 14, 2022

Kanishk Gandhi, Siddharth Karamcheti, Madeline Liao, Dorsa Sadigh

Abstract: Imitation learning from human-provided demonstrations is a strong approach for learning policies for robot manipulation. While the ideal dataset for imitation learning is homogeneous and low-variance -- reflecting a single, optimal method for performing a task -- natural human behavior has a great deal of heterogeneity, with several optimal ways to demonstrate a task. This multimodality is inconsequential to human users, with task variations manifesting as subconscious choices; for example, reaching down, then across to grasp an object, versus reaching across, then down. Yet, this mismatch presents a problem for interactive imitation learning, where sequences of users improve on a policy by iteratively collecting new, possibly conflicting demonstrations. To combat this problem of demonstrator incompatibility, this work designs an approach for 1) measuring the compatibility of a new demonstration given a base policy, and 2) actively eliciting more compatible demonstrations from new users. Across two simulation tasks requiring long-horizon, dexterous manipulation and a real-world "food plating" task with a Franka Emika Panda arm, we show that we can both identify incompatible demonstrations via post-hoc filtering, and apply our compatibility measure to actively elicit compatible demonstrations from new users, leading to improved task success rates across simulated and real environments.

* To appear at the 6th Annual Conference on Robot Learning (CoRL) 2022

LILA: Language-Informed Latent Actions

Nov 05, 2021

Siddharth Karamcheti, Megha Srivastava, Percy Liang, Dorsa Sadigh

Abstract: We introduce Language-Informed Latent Actions (LILA), a framework for learning natural language interfaces in the context of human-robot collaboration. LILA falls under the shared autonomy paradigm: in addition to providing discrete language inputs, humans are given a low-dimensional controller -- e.g., a 2 degree-of-freedom (DoF) joystick that can move left/right and up/down -- for operating the robot. LILA learns to use language to modulate this controller, providing users with a language-informed control space: given an instruction like "place the cereal bowl on the tray," LILA may learn a 2-DoF space where one dimension controls the distance from the robot's end-effector to the bowl, and the other dimension controls the robot's end-effector pose relative to the grasp point on the bowl. We evaluate LILA with real-world user studies, where users can provide a language instruction while operating a 7-DoF Franka Emika Panda Arm to complete a series of complex manipulation tasks. We show that LILA models are not only more sample efficient and performant than imitation learning and end-effector control baselines, but that they are also qualitatively preferred by users.

* Accepted at the 5th Conference on Robot Learning (CoRL). Joint first authorship. 21 Pages, 11 Figures

On the Opportunities and Risks of Foundation Models

Aug 18, 2021

Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill(+104 more)

Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.

* Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI)

Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering

Jul 06, 2021

Siddharth Karamcheti, Ranjay Krishna, Li Fei-Fei, Christopher D. Manning

Abstract: Active learning promises to alleviate the massive data needs of supervised machine learning: it has successfully improved sample efficiency by an order of magnitude on traditional tasks like topic classification and object recognition. However, we uncover a striking contrast to this promise: across 5 models and 4 datasets on the task of visual question answering, a wide variety of active learning approaches fail to outperform random selection. To understand this discrepancy, we profile 8 active learning methods on a per-example basis, and identify the problem as collective outliers -- groups of examples that active learning methods prefer to acquire but models fail to learn (e.g., questions that ask about text in images or require external knowledge). Through systematic ablation experiments and qualitative visualizations, we verify that collective outliers are a general phenomenon responsible for degrading pool-based active learning. Notably, we show that active learning sample efficiency increases significantly as the number of collective outliers in the active learning pool decreases. We conclude with a discussion and prescriptive recommendations for mitigating the effects of these outliers in future work.

* Accepted at ACL-IJCNLP 2021. 17 pages, 16 Figures
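
The abstract above contrasts active learning acquisition with random selection from a pool. As a rough illustration of what that comparison involves, the sketch below ranks unlabeled pool examples by predictive entropy (one common uncertainty-based acquisition strategy, not necessarily one of the 8 methods the paper profiles) next to a random baseline. The function names and the toy probabilities are assumptions made for illustration.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_by_entropy(pool_probs, budget):
    """Return indices of the `budget` pool examples with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


def select_random(pool_size, budget, seed=0):
    """Random-selection baseline: sample `budget` distinct indices uniformly."""
    rng = random.Random(seed)
    return rng.sample(range(pool_size), budget)


# Hypothetical model predictions over three answer classes for four unlabeled examples.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]
chosen = select_by_entropy(pool_probs, budget=2)
baseline = select_random(len(pool_probs), budget=2)
```

An acquisition rule of this kind keeps requesting labels for whatever the model is least sure about. If those examples are ones the model cannot learn, as the abstract describes for collective outliers, the labeling budget may be spent with little gain over the random baseline.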

Targeted Data Acquisition for Evolving Negotiation Agents

Jun 16, 2021

Minae Kwon, Siddharth Karamcheti, Mariano-Florentino Cuellar, Dorsa Sadigh

Abstract: Successful negotiators must learn how to balance optimizing for self-interest and cooperation. Yet current artificial negotiation agents often heavily depend on the quality of the static datasets they were trained on, limiting their capacity to fashion an adaptive response balancing self-interest and cooperation. For this reason, we find that these agents can achieve either high utility or cooperation, but not both. To address this, we introduce a targeted data acquisition framework where we guide the exploration of a reinforcement learning agent using annotations from an expert oracle. The guided exploration incentivizes the learning agent to go beyond its static dataset and develop new negotiation strategies. We show that this enables our agents to obtain higher-reward and more Pareto-optimal solutions when negotiating with both simulated and human partners compared to standard supervised learning and reinforcement learning methods. This trend additionally holds when comparing agents using our targeted data acquisition framework to variants of agents trained with a mix of supervised learning and reinforcement learning, or to agents using tailored reward functions that explicitly optimize for utility and Pareto-optimality.

* The Thirty-eighth International Conference on Machine Learning
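
The abstract above evaluates negotiation outcomes by Pareto-optimality. For readers unfamiliar with the term, the sketch below shows the standard definition on a toy set of deals: an outcome is Pareto-optimal if no other outcome is at least as good for every party and strictly better for one. The utility pairs are made up for illustration and do not come from the paper.

```python
def dominates(a, b):
    """True if outcome `a` is at least as good as `b` for every party and strictly better for one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_optimal(outcomes):
    """Return the outcomes that no other outcome dominates, in their original order."""
    return [o for o in outcomes if not any(dominates(other, o) for other in outcomes)]


# Hypothetical (agent_utility, partner_utility) pairs for candidate deals.
deals = [(8, 2), (5, 5), (4, 4), (2, 8), (3, 1)]
front = pareto_optimal(deals)
```

Here `(4, 4)` is dominated by `(5, 5)` and `(3, 1)` by `(8, 2)`, so only the remaining three deals are on the Pareto front. An agent that maximizes its own utility alone would tend toward `(8, 2)`, which is Pareto-optimal but far from cooperative, so the two criteria in the abstract are distinct.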

Learning Visually Guided Latent Actions for Assistive Teleoperation

May 02, 2021

Siddharth Karamcheti, Albert J. Zhai, Dylan P. Losey, Dorsa Sadigh

Abstract: It is challenging for humans -- particularly those living with physical disabilities -- to control high-dimensional, dexterous robots. Prior work explores learning embedding functions that map a human's low-dimensional inputs (e.g., via a joystick) to complex, high-dimensional robot actions for assistive teleoperation; however, a central problem is that there are many more high-dimensional actions than available low-dimensional inputs. To extract the correct action and maximally assist their human controller, robots must reason over their context: for example, pressing a joystick down when interacting with a coffee cup indicates a different action than when interacting with a knife. In this work, we develop assistive robots that condition their latent embeddings on visual inputs. We explore a spectrum of visual encoders and show that incorporating object detectors pretrained on small amounts of cheap, easy-to-collect structured data enables i) accurately and robustly recognizing the current context and ii) generalizing control embeddings to new objects and tasks. In user studies with a high-dimensional physical robot arm, participants leverage this approach to perform new tasks with unseen objects. Our results indicate that structured visual representations improve few-shot performance and are subjectively preferred by users.

* Accepted at Learning for Dynamics and Control (L4DC) 2021. 12 pages, 4 figures
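
The abstract above describes mapping a low-dimensional joystick input to a high-dimensional robot action, with the mapping depending on the visual context. The toy sketch below illustrates only that idea: a hand-written lookup of linear bases stands in for the learned, visually conditioned embedding, and the object labels, basis values, and function names are all hypothetical.

```python
def decode_action(z, basis):
    """Map a low-dimensional input `z` (length k) to an n-dimensional action via a k x n basis."""
    n = len(basis[0])
    return [sum(z[i] * basis[i][j] for i in range(len(z))) for j in range(n)]


# Hypothetical per-context bases for a 2-DoF input and a 7-DoF action.
# In a learned system these would come from a model conditioned on visual input.
CONTEXT_BASES = {
    "cup": [
        [1, 0, 0, 0, 0, 0, 0],  # joystick axis 0 moves action dimension 0
        [0, 0, 1, 0, 0, 0, 0],  # joystick axis 1 moves action dimension 2
    ],
    "knife": [
        [0, 1, 0, 0, 0, 0, 0],  # joystick axis 0 moves action dimension 1
        [0, 0, 0, 0, 0, 1, 0],  # joystick axis 1 moves action dimension 5
    ],
}

joystick_down = [0.0, -1.0]
cup_action = decode_action(joystick_down, CONTEXT_BASES["cup"])
knife_action = decode_action(joystick_down, CONTEXT_BASES["knife"])
```

The same "down" input produces a different 7-dimensional action for each context, which mirrors the coffee cup versus knife example in the abstract. The sketch does not attempt the recognition step (deciding which context applies), which the abstract handles with visual encoders.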

ELLA: Exploration through Learned Language Abstraction

Mar 10, 2021

Suvir Mirchandani, Siddharth Karamcheti, Dorsa Sadigh

Abstract: Building agents capable of understanding language instructions is critical to effective and robust human-AI collaboration. Recent work focuses on training these instruction-following agents via reinforcement learning in environments with synthetic language; however, these instructions often define long-horizon, sparse-reward tasks, and learning policies requires many episodes of experience. To this end, we introduce ELLA: Exploration through Learned Language Abstraction, a reward shaping approach that correlates high-level instructions with simpler low-level instructions to enrich the sparse rewards afforded by the environment. ELLA has two key elements: 1) A termination classifier that identifies when agents complete low-level instructions, and 2) A relevance classifier that correlates low-level instructions with success on high-level tasks. We learn the termination classifier offline from pairs of instructions and terminal states. Notably, in departure from prior work in language and abstraction, we learn the relevance classifier online, without relying on an explicit decomposition of high-level instructions to low-level instructions. On a suite of complex grid world environments with varying instruction complexities and reward sparsity, ELLA shows a significant gain in sample efficiency across several environments compared to competitive language-based reward shaping and no-shaping methods.

* 13 pages, 8 figures
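
The abstract above names two components, a termination classifier and a relevance classifier, that together enrich a sparse environment reward. The sketch below shows one simple way such components could be combined into a shaped reward; it is an illustration under stated assumptions, not the paper's formulation. Both classifiers are replaced by hand-written stand-in functions, and the bonus value, state fields, and instruction strings are hypothetical.

```python
def shaped_reward(env_reward, state, low_level_instructions,
                  is_terminated, is_relevant, bonus=0.1):
    """Add a small bonus for each relevant low-level instruction completed in `state`.

    `is_terminated(state, instr)` stands in for a termination classifier and
    `is_relevant(instr)` for a relevance classifier; both are hypothetical callables.
    """
    extra = sum(
        bonus
        for instr in low_level_instructions
        if is_relevant(instr) and is_terminated(state, instr)
    )
    return env_reward + extra


def toy_is_terminated(state, instr):
    """Toy termination check keyed on a hand-written state dictionary."""
    if instr == "pick up the key":
        return state["holding"] == "key"
    if instr == "open the door":
        return state["door_open"]
    return False


def toy_is_relevant(instr):
    """Toy relevance check: only these two subtasks matter for the high-level task."""
    return instr in {"pick up the key", "open the door"}


state = {"holding": "key", "door_open": False}
instructions = ["pick up the key", "open the door", "go to the ball"]
reward = shaped_reward(0.0, state, instructions, toy_is_terminated, toy_is_relevant)
```

In this toy state only "pick up the key" is both relevant and completed, so the agent receives a single bonus even though the environment reward is zero. A practical version would also need to avoid paying the same bonus repeatedly and to account for the relevance classifier being learned online, as the abstract notes.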

Learning Adaptive Language Interfaces through Decomposition

Oct 11, 2020

Siddharth Karamcheti, Dorsa Sadigh, Percy Liang

Abstract: Our goal is to create an interactive natural language interface that efficiently and reliably learns from users to complete tasks in simulated robotics settings. We introduce a neural semantic parsing system that learns new high-level abstractions through decomposition: users interactively teach the system by breaking down high-level utterances describing novel behavior into low-level steps that it can understand. Unfortunately, existing methods either rely on grammars which parse sentences with limited flexibility, or neural sequence-to-sequence models that do not learn efficiently or reliably from individual examples. Our approach bridges this gap, demonstrating the flexibility of modern neural systems, as well as the one-shot reliable generalization of grammar-based methods. Our crowdsourced interactive experiments suggest that over time, users complete complex tasks more efficiently while using our system by leveraging what they just taught. At the same time, getting users to trust the system enough to be incentivized to teach high-level utterances is still an ongoing challenge. We end with a discussion of some of the obstacles we need to overcome to fully realize the potential of the interactive paradigm.

* Accepted at the 1st Workshop for Interactive and Executable Semantic Parsing (IntEx-SemPar) @ EMNLP 2020. 11 pages, 5 figures

Generating Interactive Worlds with Text

Dec 04, 2019

Angela Fan, Jack Urbanek, Pratik Ringshia, Emily Dinan, Emma Qian, Siddharth Karamcheti, Shrimai Prabhumoye, Douwe Kiela, Tim Rocktaschel, Arthur Szlam(+1 more)

Abstract: Procedurally generating cohesive and interesting game environments is challenging and time-consuming. In order for the relationships between the game elements to be natural, common sense has to be encoded into the arrangement of the elements. In this work, we investigate a machine learning approach for world creation using content from the multi-player text adventure game environment LIGHT. We introduce neural network based models to compositionally arrange locations, characters, and objects into a coherent whole. In addition to creating worlds based on existing elements, our models can generate new game content. Humans can also leverage our models to interactively aid in worldbuilding. We show that the game environments created with our approach are cohesive, diverse, and preferred by human evaluators compared to other machine learning based world construction algorithms.

Finding Generalizable Evidence by Learning to Convince Q&A Models

Sep 12, 2019

Ethan Perez, Siddharth Karamcheti, Rob Fergus, Jason Weston, Douwe Kiela, Kyunghyun Cho

Abstract: We propose a system that finds the strongest supporting evidence for a given answer to a question, using passage-based question-answering (QA) as a testbed. We train evidence agents to select the passage sentences that most convince a pretrained QA model of a given answer, if the QA model received those sentences instead of the full passage. Rather than finding evidence that convinces one model alone, we find that agents select evidence that generalizes; agent-chosen evidence increases the plausibility of the supported answer, as judged by other QA models and humans. Given its general nature, this approach improves QA in a robust manner: using agent-selected evidence (i) humans can correctly answer questions with only ~20% of the full passage and (ii) QA models can generalize to longer passages and harder questions.

* EMNLP 2019. Code available at https://github.com/ethanjperez/convince
