Alert button
Picture for Justin Wong

Justin Wong

Alert button

The Wisdom of Hindsight Makes Language Models Better Instruction Followers

Feb 10, 2023
Tianjun Zhang, Fangchen Liu, Justin Wong, Pieter Abbeel, Joseph E. Gonzalez

Figure 1 for The Wisdom of Hindsight Makes Language Models Better Instruction Followers
Figure 2 for The Wisdom of Hindsight Makes Language Models Better Instruction Followers
Figure 3 for The Wisdom of Hindsight Makes Language Models Better Instruction Followers
Figure 4 for The Wisdom of Hindsight Makes Language Models Better Instruction Followers

Reinforcement learning has seen wide success in finetuning large language models to better align with instructions via human feedback. The so-called algorithm, Reinforcement Learning with Human Feedback (RLHF) demonstrates impressive performance on the GPT series models. However, the underlying Reinforcement Learning (RL) algorithm is complex and requires an additional training pipeline for reward and value networks. In this paper, we consider an alternative approach: converting feedback to instruction by relabeling the original one and training the model for better alignment in a supervised manner. Such an algorithm doesn't require any additional parameters except for the original language model and maximally reuses the pretraining pipeline. To achieve this, we formulate instruction alignment problem for language models as a goal-reaching problem in decision making. We propose Hindsight Instruction Relabeling (HIR), a novel algorithm for aligning language models with instructions. The resulting two-stage algorithm shed light to a family of reward-free approaches that utilize the hindsightly relabeled instructions based on feedback. We evaluate the performance of HIR extensively on 12 challenging BigBench reasoning tasks and show that HIR outperforms the baseline algorithms and is comparable to or even surpasses supervised finetuning.

Viaarxiv icon

Context-Aware Streaming Perception in Dynamic Environments

Aug 16, 2022
Gur-Eyal Sela, Ionel Gog, Justin Wong, Kumar Krishna Agrawal, Xiangxi Mo, Sukrit Kalra, Peter Schafhalter, Eric Leong, Xin Wang, Bharathan Balaji, Joseph Gonzalez, Ion Stoica

Figure 1 for Context-Aware Streaming Perception in Dynamic Environments
Figure 2 for Context-Aware Streaming Perception in Dynamic Environments
Figure 3 for Context-Aware Streaming Perception in Dynamic Environments
Figure 4 for Context-Aware Streaming Perception in Dynamic Environments

Efficient vision works maximize accuracy under a latency budget. These works evaluate accuracy offline, one image at a time. However, real-time vision applications like autonomous driving operate in streaming settings, where ground truth changes between inference start and finish. This results in a significant accuracy drop. Therefore, a recent work proposed to maximize accuracy in streaming settings on average. In this paper, we propose to maximize streaming accuracy for every environment context. We posit that scenario difficulty influences the initial (offline) accuracy difference, while obstacle displacement in the scene affects the subsequent accuracy degradation. Our method, Octopus, uses these scenario properties to select configurations that maximize streaming accuracy at test time. Our method improves tracking performance (S-MOTA) by 7.4% over the conventional static approach. Further, performance improvement using our method comes in addition to, and not instead of, advances in offline accuracy.

* 26 pages, 10 figures, to be published in ECCV 2022 
Viaarxiv icon

Addressing the IEEE AV Test Challenge with Scenic and VerifAI

Aug 20, 2021
Kesav Viswanadha, Francis Indaheng, Justin Wong, Edward Kim, Ellen Kalvan, Yash Pant, Daniel J. Fremont, Sanjit A. Seshia

Figure 1 for Addressing the IEEE AV Test Challenge with Scenic and VerifAI
Figure 2 for Addressing the IEEE AV Test Challenge with Scenic and VerifAI
Figure 3 for Addressing the IEEE AV Test Challenge with Scenic and VerifAI
Figure 4 for Addressing the IEEE AV Test Challenge with Scenic and VerifAI

This paper summarizes our formal approach to testing autonomous vehicles (AVs) in simulation for the IEEE AV Test Challenge. We demonstrate a systematic testing framework leveraging our previous work on formally-driven simulation for intelligent cyber-physical systems. First, to model and generate interactive scenarios involving multiple agents, we used Scenic, a probabilistic programming language for specifying scenarios. A Scenic program defines an abstract scenario as a distribution over configurations of physical objects and their behaviors over time. Sampling from an abstract scenario yields many different concrete scenarios which can be run as test cases for the AV. Starting from a Scenic program encoding an abstract driving scenario, we can use the VerifAI toolkit to search within the scenario for failure cases with respect to multiple AV evaluation metrics. We demonstrate the effectiveness of our testing framework by identifying concrete failure scenarios for an open-source autopilot, Apollo, starting from a variety of realistic traffic scenarios.

* Accepted to the IEEE AITest Conference 2021 
Viaarxiv icon

Scaled-Time-Attention Robust Edge Network

Jul 09, 2021
Richard Lau, Lihan Yao, Todd Huster, William Johnson, Stephen Arleth, Justin Wong, Devin Ridge, Michael Fletcher, William C. Headley

Figure 1 for Scaled-Time-Attention Robust Edge Network
Figure 2 for Scaled-Time-Attention Robust Edge Network
Figure 3 for Scaled-Time-Attention Robust Edge Network
Figure 4 for Scaled-Time-Attention Robust Edge Network

This paper describes a systematic approach towards building a new family of neural networks based on a delay-loop version of a reservoir neural network. The resulting architecture, called Scaled-Time-Attention Robust Edge (STARE) network, exploits hyper dimensional space and non-multiply-and-add computation to achieve a simpler architecture, which has shallow layers, is simple to train, and is better suited for Edge applications, such as Internet of Things (IoT), over traditional deep neural networks. STARE incorporates new AI concepts such as Attention and Context, and is best suited for temporal feature extraction and classification. We demonstrate that STARE is applicable to a variety of applications with improved performance and lower implementation complexity. In particular, we showed a novel way of applying a dual-loop configuration to detection and identification of drone vs bird in a counter Unmanned Air Systems (UAS) detection application by exploiting both spatial (video frame) and temporal (trajectory) information. We also demonstrated that the STARE performance approaches that of a State-of-the-Art deep neural network in classifying RF modulations, and outperforms Long Short-term Memory (LSTM) in a special case of Mackey Glass time series prediction. To demonstrate hardware efficiency, we designed and developed an FPGA implementation of the STARE algorithm to demonstrate its low-power and high-throughput operations. In addition, we illustrate an efficient structure for integrating a massively parallel implementation of the STARE algorithm for ASIC implementation.

* 20 pages, 22 figures, 9 tables, Darpa Distribution Statement A. Approved for public release. Distribution Unlimited 
Viaarxiv icon

A Meta Learning Approach to Discerning Causal Graph Structure

Jun 06, 2021
Justin Wong, Dominik Damjakob

Figure 1 for A Meta Learning Approach to Discerning Causal Graph Structure
Figure 2 for A Meta Learning Approach to Discerning Causal Graph Structure
Figure 3 for A Meta Learning Approach to Discerning Causal Graph Structure
Figure 4 for A Meta Learning Approach to Discerning Causal Graph Structure

We explore the usage of meta-learning to derive the causal direction between variables by optimizing over a measure of distribution simplicity. We incorporate a stochastic graph representation which includes latent variables and allows for more generalizability and graph structure expression. Our model is able to learn causal direction indicators for complex graph structures despite effects of latent confounders. Further, we explore robustness of our method with respect to violations of our distributional assumptions and data scarcity. Our model is particularly robust to modest data scarcity, but is less robust to distributional changes. By interpreting the model predictions as stochastic events, we propose a simple ensemble method classifier to reduce the outcome variability as an average of biased events. This methodology demonstrates ability to infer the existence as well as the direction of a causal relationship between data distributions.

Viaarxiv icon

Learning Nonlinear Loop Invariants with Gated Continuous Logic Networks

Apr 10, 2020
Jianan Yao, Gabriel Ryan, Justin Wong, Suman Jana, Ronghui Gu

Figure 1 for Learning Nonlinear Loop Invariants with Gated Continuous Logic Networks
Figure 2 for Learning Nonlinear Loop Invariants with Gated Continuous Logic Networks
Figure 3 for Learning Nonlinear Loop Invariants with Gated Continuous Logic Networks
Figure 4 for Learning Nonlinear Loop Invariants with Gated Continuous Logic Networks

Verifying real-world programs often requires inferring loop invariants with nonlinear constraints. This is especially true in programs that perform many numerical operations, such as control systems for avionics or industrial plants. Recently, data-driven methods for loop invariant inference have shown promise, especially on linear invariants. However, applying data-driven inference to nonlinear loop invariants is challenging due to the large numbers of and magnitudes of high-order terms, the potential for overfitting on a small number of samples, and the large space of possible inequality bounds. In this paper, we introduce a new neural architecture for general SMT learning, the Gated Continuous Logic Network (G-CLN), and apply it to nonlinear loop invariant learning. G-CLNs extend the Continuous Logic Network (CLN) architecture with gating units and dropout, which allow the model to robustly learn general invariants over large numbers of terms. To address overfitting that arises from finite program sampling, we introduce fractional sampling---a sound relaxation of loop semantics to continuous functions that facilitates unbounded sampling on real domain. We additionally design a new CLN activation function, the Piecewise Biased Quadratic Unit (PBQU), for naturally learning tight inequality bounds. We incorporate these methods into a nonlinear loop invariant inference system that can learn general nonlinear loop invariants. We evaluate our system on a benchmark of nonlinear loop invariants and show it solves 26 out of 27 problems, 3 more than prior work, with an average runtime of 53.3 seconds. We further demonstrate the generic learning ability of G-CLNs by solving all 124 problems in the linear Code2Inv benchmark. We also perform a quantitative stability evaluation and show G-CLNs have a convergence rate of $97.5\%$ on quadratic problems, a $39.2\%$ improvement over CLN models.

Viaarxiv icon

CLN2INV: Learning Loop Invariants with Continuous Logic Networks

Oct 17, 2019
Gabriel Ryan, Justin Wong, Jianan Yao, Ronghui Gu, Suman Jana

Figure 1 for CLN2INV: Learning Loop Invariants with Continuous Logic Networks
Figure 2 for CLN2INV: Learning Loop Invariants with Continuous Logic Networks
Figure 3 for CLN2INV: Learning Loop Invariants with Continuous Logic Networks
Figure 4 for CLN2INV: Learning Loop Invariants with Continuous Logic Networks

Program verification offers a framework for ensuring program correctness and therefore systematically eliminating different classes of bugs. Inferring loop invariants is one of the main challenges behind automated verification of real-world programs which often contain many loops. In this paper, we present Continuous Logic Network (CLN), a novel neural architecture for automatically learning loop invariants directly from program execution traces. Unlike existing neural networks, CLNs can learn precise and explicit representations of formulas in Satisfiability Modulo Theories (SMT) for loop invariants from program execution traces. We develop a new sound and complete semantic mapping for assigning SMT formulas to continuous truth values that allows CLNs to be trained efficiently. We use CLNs to implement a new inference system for loop invariants, CLN2INV, that significantly outperforms existing approaches on the popular Code2Inv dataset. CLN2INV is the first tool to solve all 124 theoretically solvable problems in the Code2Inv dataset. Moreover, CLN2INV takes only 1.1 seconds on average for each problem, which is 40 times faster than existing approaches. We further demonstrate that CLN2INV can even learn 12 significantly more complex loop invariants than the ones required for the Code2Inv dataset.

Viaarxiv icon