Jonathan Balloch

A Simple Way to Incorporate Novelty Detection in World Models

Oct 12, 2023
Geigh Zollicoffer, Kenneth Eaton, Jonathan Balloch, Julia Kim, Mark O. Riedl, Robert Wright

Reinforcement learning (RL) using world models has found significant recent success. However, when a sudden change to world mechanics or properties occurs, agent performance and reliability can decline dramatically. We refer to such sudden changes in visual properties or state transitions as {\em novelties}. Incorporating novelty detection into world model frameworks is a crucial task for protecting the agent when deployed. In this paper, we propose straightforward bounding approaches to incorporate novelty detection into world model RL agents by using the misalignment between the world model's hallucinated states and the true observed states as an anomaly score. We first provide an ontology of novelty detection relevant to sequential decision making, then we provide effective approaches to detecting novelties in a distribution of transitions learned by an agent in a world model. Finally, we show the advantage of our work in a novel environment compared to traditional machine learning novelty detection methods as well as currently accepted RL-focused novelty detection algorithms.
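The bounding idea in the abstract can be sketched in a few lines: score each step by the distance between the world model's hallucinated state and the true observed state, then flag scores that exceed a bound calibrated on pre-novelty transitions. The L2 distance and the mean-plus-k-sigma bound below are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def anomaly_scores(predicted_states, observed_states):
    """Per-step anomaly score: L2 distance between the world model's
    hallucinated state and the true observed state."""
    predicted = np.asarray(predicted_states, dtype=float)
    observed = np.asarray(observed_states, dtype=float)
    return np.linalg.norm(predicted - observed, axis=-1)

def detect_novelty(scores, calibration_scores, k=3.0):
    """Flag steps whose score exceeds a bound calibrated on nominal
    (pre-novelty) transitions: mean + k * std of calibration scores."""
    threshold = calibration_scores.mean() + k * calibration_scores.std()
    return scores > threshold
```

Calibration scores would come from rollouts in the pre-novelty environment, where the world model's predictions are expected to track observations closely.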

Neuro-Symbolic World Models for Adapting to Open World Novelty

Jan 16, 2023
Jonathan Balloch, Zhiyu Lin, Robert Wright, Xiangyu Peng, Mustafa Hussain, Aarun Srinivas, Julia Kim, Mark O. Riedl

Open-world novelty--a sudden change in the mechanics or properties of an environment--is a common occurrence in the real world. Novelty adaptation is an agent's ability to improve its policy performance post-novelty. Most reinforcement learning (RL) methods assume that the world is a closed, fixed process. Consequently, RL policies adapt inefficiently to novelties. To address this, we introduce WorldCloner, an end-to-end trainable neuro-symbolic world model for rapid novelty adaptation. WorldCloner learns an efficient symbolic representation of the pre-novelty environment transitions and uses this transition model to detect novelty and efficiently adapt to it in a single-shot fashion. Additionally, WorldCloner augments the policy learning process using imagination-based adaptation, where the world model simulates transitions of the post-novelty environment to help the policy adapt. By blending "imagined" transitions with interactions in the post-novelty environment, performance can be recovered with fewer total environment interactions. Using environments designed for studying novelty in sequential decision-making problems, we show that the symbolic world model helps its neural policy adapt more efficiently than model-based and model-free neural-only reinforcement learning methods.
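The symbolic-transition-model idea can be illustrated with a toy sketch: tabulate deterministic (state, action) → next-state rules from pre-novelty experience, flag a novelty when an observed outcome violates a rule, and repair the violated rule in a single shot. The dictionary representation here is an illustrative stand-in, not WorldCloner's actual rule formalism.

```python
def learn_rules(transitions):
    """Tabulate deterministic symbolic rules (state, action) -> next_state
    from pre-novelty experience."""
    rules = {}
    for state, action, next_state in transitions:
        rules[(state, action)] = next_state
    return rules

def step_with_novelty_check(rules, state, action, observed_next):
    """Compare the rule's prediction against the observed outcome.
    On mismatch, report novelty and repair the rule in a single shot."""
    predicted = rules.get((state, action))
    novel = predicted is not None and predicted != observed_next
    if novel or predicted is None:
        rules[(state, action)] = observed_next  # one-shot rule update
    return novel
```

The single-shot repair is the key contrast with a purely neural model, which would need many gradient steps to reflect the same change.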

* 9 pages, 8 figures, Extended Abstract accepted for presentation at AAMAS 2023 
NovGrid: A Flexible Grid World for Evaluating Agent Response to Novelty

Mar 23, 2022
Jonathan Balloch, Zhiyu Lin, Mustafa Hussain, Aarun Srinivas, Robert Wright, Xiangyu Peng, Julia Kim, Mark Riedl

A robust body of reinforcement learning techniques has been developed to solve complex sequential decision-making problems. However, these methods assume that training and evaluation tasks come from similarly or identically distributed environments. This assumption does not hold in real life, where small novel changes to the environment can make a previously learned policy fail or introduce simpler solutions that might never be found. To that end, we explore the concept of {\em novelty}, defined in this work as a sudden change to the mechanics or properties of an environment. We provide an ontology of novelties most relevant to sequential decision making, which distinguishes between novelties that affect objects versus actions, unary properties versus non-unary relations, and the distribution of solutions to a task. We introduce NovGrid, a novelty generation framework built on MiniGrid that acts as a toolkit for rapidly developing and evaluating novelty-adaptation-enabled reinforcement learning techniques. Along with the core NovGrid, we provide exemplar novelties aligned with our ontology and instantiate them as novelty templates that can be applied to many MiniGrid-compliant environments. Finally, we present a set of metrics built into our framework for the evaluation of novelty-adaptation-enabled machine-learning techniques, and show characteristics of a baseline RL model using these metrics.
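The essence of a novelty template is a wrapper that swaps in post-novelty mechanics once a trigger point is reached, so the agent trains under one transition function and is evaluated under another. The class below is a minimal sketch of that pattern with plain callables, not NovGrid's actual MiniGrid-based API.

```python
class NoveltyWrapper:
    """Minimal sketch of a novelty template: wraps an environment's step
    function and swaps in post-novelty mechanics at a trigger step."""

    def __init__(self, env_step, novel_step, novelty_at):
        self.env_step = env_step      # pre-novelty transition function
        self.novel_step = novel_step  # post-novelty transition function
        self.novelty_at = novelty_at  # step at which the novelty is injected
        self.t = 0

    def step(self, state, action):
        self.t += 1
        fn = self.novel_step if self.t >= self.novelty_at else self.env_step
        return fn(state, action)
```

Metrics like time-to-recovery can then be computed by comparing agent returns before and after `novelty_at`.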

* 7 pages, 4 figures, AAAI Spring Symposium 2022 on Designing Artificial Intelligence for Open Worlds (Long Oral) 
Automated Story Generation as Question-Answering

Dec 07, 2021
Louis Castricato, Spencer Frazier, Jonathan Balloch, Nitya Tarakad, Mark Riedl

Neural language model-based approaches to automated story generation suffer from two important limitations. First, language model-based story generators generally do not work toward a given goal or ending. Second, they often lose coherence as the story gets longer. We propose a novel approach to automated story generation that treats the problem as one of generative question-answering. Our proposed story generation system starts with sentences encapsulating the final event of the story. The system then iteratively (1) analyzes the text describing the most recent event, (2) generates a question about "why" a character is doing the thing they are doing in the event, and then (3) attempts to generate another, preceding event that answers this question.
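The iterative loop described in the abstract can be sketched independently of any particular language model: starting from the ending, repeatedly ask "why" about the earliest known event and prepend a generated answer. The `ask_why` and `answer_question` callables below are hypothetical stand-ins for the system's question-generation and question-answering models.

```python
def generate_story_backward(final_event, ask_why, answer_question, max_events=5):
    """Sketch of generation-as-question-answering: build the story
    back-to-front from its final event."""
    events = [final_event]
    for _ in range(max_events - 1):
        question = ask_why(events[0])           # (2) "why" question about earliest event
        preceding = answer_question(question)   # (3) preceding event answering it
        if preceding is None:                   # model cannot regress further
            break
        events.insert(0, preceding)             # story grows toward its beginning
    return events
```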

Always Be Dreaming: A New Approach for Data-Free Class-Incremental Learning

Jun 17, 2021
James Smith, Yen-Chang Hsu, Jonathan Balloch, Yilin Shen, Hongxia Jin, Zsolt Kira

Modern computer vision applications suffer from catastrophic forgetting when incrementally learning new concepts over time. The most successful approaches to alleviate this forgetting require extensive replay of previously seen data, which is problematic when memory constraints or data legality concerns exist. In this work, we consider the high-impact problem of Data-Free Class-Incremental Learning (DFCIL), where an incremental learning agent must learn new concepts over time without storing generators or training data from past tasks. One approach for DFCIL is to replay synthetic images produced by inverting a frozen copy of the learner's classification model, but we show this approach fails for common class-incremental benchmarks when using standard distillation strategies. We diagnose the cause of this failure and propose a novel incremental distillation strategy for DFCIL, contributing a modified cross-entropy training and importance-weighted feature distillation, and show that our method results in up to a 25.1% increase in final task accuracy (absolute difference) compared to SOTA DFCIL methods for common class-incremental benchmarks. Our method even outperforms several standard replay based methods which store a coreset of images.
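The two contributed ingredients can be sketched abstractly: a cross-entropy restricted to the current task's classes (so synthetic replay of old classes cannot distort their logits) and an importance-weighted L2 distillation between student and frozen-teacher features. These NumPy versions are simplified illustrations of the idea, not the paper's exact losses.

```python
import numpy as np

def local_cross_entropy(logits, label, new_class_ids):
    """Modified cross-entropy: softmax is computed only over the current
    task's class logits, leaving past-class logits untouched."""
    z = logits[new_class_ids]
    z = z - z.max()                      # numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return float(-np.log(p[new_class_ids.index(label)]))

def weighted_feature_distill(f_student, f_teacher, importance):
    """Importance-weighted L2 distillation between student features and
    the frozen teacher's features."""
    d = np.asarray(f_student, dtype=float) - np.asarray(f_teacher, dtype=float)
    return float((np.asarray(importance) * d * d).sum())
```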

Fabula Entropy Indexing: Objective Measures of Story Coherence

Mar 23, 2021
Louis Castricato, Spencer Frazier, Jonathan Balloch, Mark Riedl

Automated story generation remains a difficult area of research because it lacks strong objective measures. Generated stories may be linguistically sound but in many cases lack the narrative coherence required for a compelling, logically sound story. To address this, we present Fabula Entropy Indexing (FEI), an evaluation method that assesses story coherence by measuring the degree to which human participants agree with each other when answering true/false questions about stories. We devise two theoretically grounded measures of reader question-answering entropy: the entropy of world coherence (EWC) and the entropy of transitional coherence (ETC), focusing on global and local coherence, respectively. We evaluate these metrics by testing them on human-written stories and comparing against the same stories after they have been corrupted to introduce incoherencies. We show that in these controlled studies, our entropy indices provide a reliable objective measure of story coherence.
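The core measurement is the Shannon entropy of readers' binary answers to each question: full agreement gives 0 bits, a 50/50 split gives 1 bit. The sketch below shows that calculation and a simple per-story average; the aggregation into EWC and ETC over their respective question types follows the same pattern.

```python
import math

def answer_entropy(answers):
    """Shannon entropy (bits) of readers' true/false answers to one
    question; 0 = full agreement, 1 = maximal disagreement.
    `answers` is a list of 0/1 responses."""
    p = sum(answers) / len(answers)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def coherence_index(question_answers):
    """Average entropy over a story's questions; lower means readers
    agree more, i.e. the story is more coherent."""
    return sum(answer_entropy(a) for a in question_answers) / len(question_answers)
```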

Memory-Efficient Semi-Supervised Continual Learning: The World is its Own Replay Buffer

Jan 23, 2021
James Smith, Jonathan Balloch, Yen-Chang Hsu, Zsolt Kira

Rehearsal is a critical component of class-incremental continual learning, yet it requires a substantial memory budget. Our work investigates whether this memory budget can be significantly reduced by leveraging unlabeled data from an agent's environment in a realistic and challenging continual learning paradigm. Specifically, we explore and formalize a novel semi-supervised continual learning (SSCL) setting, where labeled data is scarce yet non-i.i.d. unlabeled data from the agent's environment is plentiful. Importantly, data distributions in the SSCL setting are realistic and therefore reflect object class correlations between, and among, the labeled and unlabeled data distributions. We show that a strategy built on pseudo-labeling, consistency regularization, Out-of-Distribution (OoD) detection, and knowledge distillation reduces forgetting in this setting. Our approach, DistillMatch, increases average task accuracy over the state of the art by at least 8.7%, and by up to 54.5%, in SSCL CIFAR-100 experiments. Moreover, we demonstrate that DistillMatch can save up to 0.23 stored images per processed unlabeled image, compared to 0.08 for the next best method. Our results suggest that focusing on realistic correlated distributions is a significant new perspective that accentuates the importance of leveraging the world's structure as a continual learning strategy.
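The pseudo-labeling and OoD-detection ingredients combine naturally: an unlabeled example receives a pseudo-label only if the model is confident and the example does not look out-of-distribution. The sketch below uses predictive entropy as a crude OoD proxy; both thresholds and the proxy itself are illustrative assumptions, not DistillMatch's actual mechanism.

```python
import numpy as np

def select_pseudo_labels(probs, conf_threshold=0.95, max_entropy=0.5):
    """Keep a pseudo-label for each unlabeled example only if the model
    is confident (max prob high) and the example looks in-distribution
    (predictive entropy low, a crude OoD proxy)."""
    probs = np.asarray(probs, dtype=float)
    conf = probs.max(axis=1)
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    keep = (conf >= conf_threshold) & (ent <= max_entropy)
    return probs.argmax(axis=1), keep
```

Kept examples would then feed the consistency-regularization and distillation losses; rejected ones are simply skipped rather than stored.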

Taking Recoveries to Task: Recovery-Driven Development for Recipe-based Robot Tasks

Jan 28, 2020
Siddhartha Banerjee, Angel Daruna, David Kent, Weiyu Liu, Jonathan Balloch, Abhinav Jain, Akshay Krishnan, Muhammad Asif Rana, Harish Ravichandar, Binit Shah, Nithin Shrivatsav, Sonia Chernova

Robot task execution when situated in real-world environments is fragile. As such, robot architectures must rely on robust error recovery, which adds non-trivial complexity to already highly complex robot systems. To handle this complexity during development, we introduce Recovery-Driven Development (RDD), an iterative task scripting process that facilitates rapid task and recovery development by leveraging hierarchical specification, separation of nominal task and recovery development, and situated testing. We validate our approach with our challenge-winning mobile manipulator software architecture, developed using RDD for the FetchIt! Challenge at the IEEE 2019 International Conference on Robotics and Automation. We attribute the success of our system to the level of robustness achieved using RDD, and conclude with lessons learned for developing such systems.
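The separation of nominal task from recovery can be sketched as a small execution pattern: try the nominal step, and on failure run each registered recovery behavior before retrying. This is an illustrative rendering of the RDD idea, not the architecture's actual scripting interface.

```python
def execute_with_recovery(step, recoveries):
    """Run one nominal task step; on failure, invoke each registered
    recovery behavior in order and retry until one path succeeds.
    `step` returns True on success; `recoveries` is a list of callables."""
    if step():
        return True
    for recover in recoveries:
        recover()      # e.g. reposition, re-perceive, re-grasp
        if step():
            return True
    return False       # all recoveries exhausted; escalate upward
```

Because recoveries are registered separately from the nominal script, new failure modes found in situated testing can be handled without rewriting the task itself.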

* Published and presented at International Symposium on Robotics Research (ISRR), 2019 in Hanoi, Vietnam 
Tool Macgyvering: Tool Construction Using Geometric Reasoning

Feb 10, 2019
Lakshmi Nair, Jonathan Balloch, Sonia Chernova

MacGyvering is defined as creating or repairing something in an inventive or improvised way by utilizing objects that are available at hand. In this paper, we explore a subset of Macgyvering problems involving tool construction, i.e., creating tools from parts available in the environment. We formalize the overall problem domain of tool Macgyvering, introducing three levels of complexity for tool construction and substitution problems, and presenting a novel computational framework aimed at solving one level of the tool Macgyvering problem, specifically contributing a novel algorithm for tool construction based on geometric reasoning. We validate our approach by constructing three tools using a 7-DOF robot arm.

* Video demonstration available at: https://www.youtube.com/channel/UCxnm8iu1TS75YNXcAiI-nEw ; accepted to the International Conference on Robotics and Automation 2019