
Vasanth Sarathy


LgTS: Dynamic Task Sampling using LLM-generated sub-goals for Reinforcement Learning Agents

Oct 14, 2023
Yash Shukla, Wenchang Gao, Vasanth Sarathy, Alvaro Velasquez, Robert Wright, Jivko Sinapov

Recent advancements in the reasoning abilities of Large Language Models (LLMs) have promoted their usage in problems that require high-level planning for robots and artificial agents. However, current techniques that utilize LLMs for such planning tasks make certain key assumptions: access to datasets that permit finetuning, meticulously engineered prompts that provide only relevant and essential information to the LLM, and, most importantly, a deterministic approach to executing the LLM responses, either in the form of existing policies or plan operators. In this work, we propose LgTS (LLM-guided Teacher-Student learning), a novel approach that explores the planning abilities of LLMs to provide a graphical representation of sub-goals to a reinforcement learning (RL) agent that does not have access to the transition dynamics of the environment. The RL agent uses a Teacher-Student learning algorithm to learn a set of successful policies for reaching the goal state from the start state while simultaneously minimizing the number of environmental interactions. Unlike previous methods that utilize LLMs, our approach does not assume access to a proprietary or fine-tuned LLM, nor does it require pre-trained policies that achieve the sub-goals proposed by the LLM. Through experiments on a gridworld-based DoorKey domain and a search-and-rescue-inspired domain, we show that generating a graphical structure of sub-goals helps in learning policies for the LLM-proposed sub-goals, and that the Teacher-Student learning algorithm minimizes the number of environment interactions when the transition dynamics are unknown.
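The Teacher-Student loop over an LLM-proposed sub-goal graph can be sketched roughly as follows. This is a minimal illustration, not the paper's exact algorithm: the DoorKey-style DAG, the epsilon-greedy sampling rule, and the success-rate update are all assumptions made for the example.

```python
import random

# Hypothetical sub-goal DAG for a DoorKey-style task, as an LLM
# might propose: start -> get_key -> open_door -> reach_goal.
dag = {"start": ["get_key"], "get_key": ["open_door"],
       "open_door": ["reach_goal"], "reach_goal": []}

# Running success-rate estimate of the student's policy per sub-task (edge).
success = {(u, v): 0.0 for u in dag for v in dag[u]}

def teacher_sample(edges, eps=0.2):
    """Epsilon-greedy stand-in for the Teacher's task-sampling rule:
    mostly train the sub-task the student is worst at, occasionally
    pick a random one to keep the estimates fresh."""
    if random.random() < eps:
        return random.choice(edges)
    return min(edges, key=lambda e: success[e])

# One teacher-student round: sample a sub-task, pretend the student
# ran a training episode on it, and update its success estimate.
task = teacher_sample(list(success))
success[task] = 0.9 * success[task] + 0.1 * 1.0  # e.g., episode succeeded
```

Sampling by lowest success rate is just one plausible Teacher heuristic; the point is that the DAG restricts which sub-tasks are worth training at all.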


RAPid-Learn: A Framework for Learning to Recover for Handling Novelties in Open-World Environments

Jun 24, 2022
Shivam Goel, Yash Shukla, Vasanth Sarathy, Matthias Scheutz, Jivko Sinapov


We propose RAPid-Learn (Learning to Recover and Plan Again), a hybrid planning and learning method for adapting to sudden and unexpected changes in an agent's environment (i.e., novelties). RAPid-Learn is designed to formulate and solve modifications to a task's Markov Decision Process (MDP) on the fly, exploiting domain knowledge to learn any new dynamics caused by the environmental changes. In particular, it learns action executors that can be used to resolve execution impasses, leading to successful plan execution, and this novelty information is reflected in its updated domain model. We demonstrate its efficacy by introducing a wide variety of novelties in a gridworld environment inspired by Minecraft, and compare our algorithm with transfer learning baselines from the literature. Our method is (1) effective even in the presence of multiple novelties, (2) more sample efficient than transfer learning RL baselines, and (3) robust to incomplete model information, as opposed to pure symbolic planning approaches.
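The plan-then-learn control flow described above can be sketched as below. This is a simplified illustration of the recovery loop only: the function names and interfaces are assumptions, and the real system formulates a sub-MDP around the failed operator rather than taking an opaque `learn_executor` callback.

```python
def execute_with_recovery(plan, execute, learn_executor):
    """Run each plan step; on an execution impasse, hand the failed
    step to a learner that trains a recovery executor, then retry
    with it. Returns True iff the whole plan eventually succeeds."""
    for step in plan:
        if execute(step):
            continue
        recovery = learn_executor(step)   # e.g., train an RL policy
        if not recovery():                # retry via the learned executor
            return False                  # impasse could not be resolved
    return True

# Toy usage: a novelty makes one operator fail outright, and the
# learned executor recovers it.
plan = ["approach_tree", "chop_tree", "craft_planks"]
result = execute_with_recovery(
    plan,
    execute=lambda step: step != "chop_tree",
    learn_executor=lambda step: (lambda: True),
)
```

The key design point the abstract makes is that learning is targeted at the impasse rather than relearning the whole task, which is where the sample-efficiency gain over transfer-learning RL baselines comes from.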

* Proceedings of the IEEE Conference on Development and Learning (ICDL 2022) 

From Unstructured Text to Causal Knowledge Graphs: A Transformer-Based Approach

Feb 23, 2022
Scott Friedman, Ian Magnusson, Vasanth Sarathy, Sonja Schmer-Galunder


Qualitative causal relationships compactly express the direction, dependency, temporal constraints, and monotonicity constraints of discrete or continuous interactions in the world. In everyday or academic language, we may express interactions between quantities (e.g., sleep decreases stress), between discrete events or entities (e.g., a protein inhibits another protein's transcription), or between intentional or functional factors (e.g., hospital patients pray to relieve their pain). Extracting and representing these diverse causal relations are critical for cognitive systems that operate in domains spanning from scientific discovery to social science. This paper presents a transformer-based NLP architecture that jointly extracts knowledge graphs including (1) variables or factors described in language, (2) qualitative causal relationships over these variables, (3) qualifiers and magnitudes that constrain these causal relationships, and (4) word senses to localize each extracted node within a large ontology. We do not claim that our transformer-based architecture is itself a cognitive system; however, we provide evidence of its accurate knowledge graph extraction in real-world domains and the practicality of its resulting knowledge graphs for cognitive systems that perform graph-based reasoning. We demonstrate this approach and include promising results in two use cases, processing textual inputs from academic publications, news articles, and social media.
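The four kinds of extracted structure listed above (factors, qualitative causal links, qualifiers, and word senses) map naturally onto a small graph schema. A minimal sketch, where the field names and sense identifiers are illustrative assumptions rather than the paper's actual output format:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Factor:
    text: str    # surface form of the variable/factor in the input text
    sense: str   # word sense / ontology node it is localized to

@dataclass(frozen=True)
class CausalEdge:
    cause: Factor
    effect: Factor
    polarity: str                    # "+" (promotes) or "-" (inhibits)
    qualifier: Optional[str] = None  # hedge or magnitude constraint, if any

# The running example "sleep decreases stress" as one extracted edge;
# the sense labels are placeholders for real ontology identifiers.
edge = CausalEdge(Factor("sleep", "sleep.n.01"),
                  Factor("stress", "stress.n.01"),
                  polarity="-")
```

Representing polarity and qualifiers on the edge, and word senses on the nodes, is what lets downstream cognitive systems do graph-based qualitative reasoning over the extractions.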

* arXiv admin note: substantial text overlap with arXiv:2108.13304 

SPOTTER: Extending Symbolic Planning Operators through Targeted Reinforcement Learning

Dec 24, 2020
Vasanth Sarathy, Daniel Kasenberg, Shivam Goel, Jivko Sinapov, Matthias Scheutz


Symbolic planning models allow decision-making agents to sequence actions in arbitrary ways to achieve a variety of goals in dynamic domains. However, they are typically handcrafted and tend to require precise formulations that are not robust to human error. Reinforcement learning (RL) approaches do not require such models, and instead learn domain dynamics by exploring the environment and collecting rewards. However, RL approaches tend to require millions of episodes of experience and often learn policies that are not easily transferable to other tasks. In this paper, we address one aspect of the open problem of integrating these approaches: how can decision-making agents resolve discrepancies in their symbolic planning models while attempting to accomplish goals? We propose an integrated framework named SPOTTER that uses RL to augment and support ("spot") a planning agent by discovering new operators needed to accomplish goals that are initially unreachable for the agent. SPOTTER outperforms pure-RL approaches while also discovering transferable symbolic knowledge, and does not require supervision, successful plan traces, or any a priori knowledge about the missing planning operator.
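The last step of such operator discovery, lifting a successful learned behavior into a symbolic operator, can be sketched as follows. This is a simplified illustration under assumed state representations (sets of ground literals), not SPOTTER's actual synthesis procedure:

```python
def synthesize_operator(pre_state, post_state):
    """Turn the symbolic difference between the states observed before
    and after a successful learned behavior into a candidate planning
    operator: what held before becomes the preconditions, and the
    state diff becomes the add/delete effects."""
    return {
        "preconditions": frozenset(pre_state),
        "add_effects": frozenset(post_state - pre_state),
        "del_effects": frozenset(pre_state - post_state),
    }

# E.g., a learned "unlock door" behavior observed in a gridworld:
before = {"holding(key)", "at(door)", "locked(door)"}
after = {"holding(key)", "at(door)", "open(door)"}
op = synthesize_operator(before, after)
```

Once such an operator is added to the domain model, the planner can reach goals that were previously unreachable, which is the sense in which RL "spots" the planning agent.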

* Accepted to AAMAS 2021 

When Exceptions are the Norm: Exploring the Role of Consent in HRI

Feb 04, 2019
Vasanth Sarathy, Thomas Arnold, Matthias Scheutz

HRI researchers have made major strides in developing robotic architectures that are capable of reading a limited set of social cues and producing behaviors that enhance their likeability and the comfort humans feel around them. However, the cues in these models are fairly direct and the interactions largely dyadic. To capture the normative qualities of interaction more robustly, we propose consent as a distinct, critical area for HRI research. Drawing together important insights from existing HRI work on topics like touch, proxemics, gaze, and moral norms, the notion of consent reveals key expectations that can shape how a robot acts in social space. By sorting various kinds of consent through social and legal doctrine, we delineate empirical and technical questions to meet consent challenges faced in major application domains and robotic roles. Attention to consent could show, for example, how extraordinary, norm-violating actions can be justified by agents and accepted by those around them. We argue that operationalizing ideas from legal scholarship can better guide how robotic systems might cultivate and sustain proper forms of consent.


Quasi-Dilemmas for Artificial Moral Agents

Jul 06, 2018
Daniel Kasenberg, Vasanth Sarathy, Thomas Arnold, Matthias Scheutz, Tom Williams

In this paper we describe moral quasi-dilemmas (MQDs): situations similar to moral dilemmas, but in which an agent is unsure whether exploring the plan space or the world may reveal a course of action that satisfies all moral requirements. We argue that artificial moral agents (AMAs) should be built to handle MQDs (in particular, by exploring the plan space rather than immediately accepting the inevitability of the moral dilemma), and that MQDs may be useful for evaluating AMA architectures.

* Accepted to the International Conference on Robot Ethics and Standards (ICRES), 2018 

The MacGyver Test - A Framework for Evaluating Machine Resourcefulness and Creative Problem Solving

Apr 26, 2017
Vasanth Sarathy, Matthias Scheutz

Current measures of machine intelligence are either difficult to evaluate or lack the ability to test a robot's problem-solving capacity in open worlds. We propose a novel evaluation framework based on the formal notion of the MacGyver Test, which provides a practical way of assessing the resilience and resourcefulness of artificial agents.


Enabling Basic Normative HRI in a Cognitive Robotic Architecture

Feb 11, 2016
Vasanth Sarathy, Jason R. Wilson, Thomas Arnold, Matthias Scheutz


Collaborative human activities are grounded in social and moral norms, which humans consciously and subconsciously use to guide and constrain their decision-making and behavior, thereby strengthening their interactions and preventing emotional and physical harm. This type of norm-based processing is also critical for robots in many human-robot interaction scenarios (e.g., when helping elderly and disabled persons in assisted living facilities, or assisting humans in assembly tasks in factories or even the space station). In this position paper, we briefly describe how several components in an integrated cognitive architecture can be used to implement processes that are required for normative human-robot interactions, especially in collaborative tasks where actions and situations could potentially be perceived as threatening and thus require a change in course of action to mitigate the perceived threats.

* Presented at "2nd Workshop on Cognitive Architectures for Social Human-Robot Interaction 2016 (arXiv:1602.01868)" 