Large language models have demonstrated the ability to condition on and generate both natural language and programming language text. Such models open up the possibility of multi-language code generation: could code generation models generalize knowledge from one language to another? Although contemporary code generation models can generate semantically correct Python code, little is known about their abilities with other languages. We facilitate the exploration of this topic by proposing MultiPL-E, the first multi-language parallel benchmark for natural-language-to-code-generation. MultiPL-E extends the HumanEval benchmark (Chen et al, 2021) to support 18 more programming languages, encompassing a range of programming paradigms and popularity. We evaluate two state-of-the-art code generation models on MultiPL-E: Codex and InCoder. We find that on several languages, Codex matches and even exceeds its performance on Python. The range of programming languages represented in MultiPL-E allow us to explore the impact of language frequency and language features on model performance. Finally, the MultiPL-E approach of compiling code generation benchmarks to new programming languages is both scalable and extensible. We describe a general approach for easily adding support for new benchmarks and languages to MultiPL-E.
Robot social navigation is influenced by human preferences and environment-specific scenarios such as elevators and doors, thus necessitating end-user adaptability. State-of-the-art approaches to social navigation fall into two categories: model-based social constraints and learning-based approaches. While effective, these approaches have fundamental limitations -- model-based approaches require constraint and parameter tuning to adapt to preferences and new scenarios, while learning-based approaches require reward functions, significant training data, and are hard to adapt to new social scenarios or new domains with limited demonstrations. In this work, we propose Iterative Dimension Informed Program Synthesis (IDIPS) to address these limitations by learning and adapting social navigation in the form of human-readable symbolic programs. IDIPS works by combining program synthesis, parameter optimization, predicate repair, and iterative human demonstration to learn and adapt model-free action selection policies from orders of magnitude less data than learning-based approaches. We introduce a novel predicate repair technique that can accommodate previously unseen social scenarios or preferences by growing existing policies. We present experimental results showing that IDIPS: 1) synthesizes effective policies that model user preference, 2) can adapt existing policies to changing preferences, 3) can extend policies to handle novel social scenarios such as locked doors, and 4) generates policies that can be transferred from simulation to real-world robots with minimal effort.
Representation learning is a fundamental task in machine learning. It consists of learning the features of data items automatically, typically using a deep neural network (DNN), instead of selecting hand-engineered features that typically have worse performance. Graph data requires specific algorithms for representation learning such as DeepWalk, node2vec, and GraphSAGE. These algorithms first sample the input graph and then train a DNN based on the samples. It is common to use GPUs for training, but graph sampling on GPUs is challenging. Sampling is an embarrassingly parallel task since each sample can be generated independently. However, the irregularity of graphs makes it hard to use GPU resources effectively. Existing graph processing, mining, and representation learning systems do not effectively parallelize sampling and this negatively impacts the end-to-end performance of representation learning. In this paper, we present NextDoor, the first system specifically designed to perform graph sampling on GPUs. NextDoor introduces a high-level API based on a novel paradigm for parallel graph sampling called transit-parallelism. We implement several graph sampling applications, and show that NextDoor runs them orders of magnitude faster than existing systems
Representation learning is a fundamental task in machine learning. It consists of learning the features of data items automatically, typically using a deep neural network (DNN), instead of selecting hand-engineered features that typically have worse performance. Graph data requires specific algorithms for representation learning such as DeepWalk, node2vec, and GraphSAGE. These algorithms first sample the input graph and then train a DNN based on the samples. It is common to use GPUs for training, but graph sampling on GPUs is challenging. Sampling is an embarrassingly parallel task since each sample can be generated independently. However, the irregularity of graphs makes it hard to use GPU resources effectively. Existing graph processing, mining, and representation learning systems do not effectively parallelize sampling and this negatively impacts the end-to-end performance of representation learning. In this paper, we present NextDoor, the first system specifically designed to perform graph sampling on GPUs. NextDoor introduces a high-level API based on a novel paradigm for parallel graph sampling called transit-parallelism. We implement several graph sampling applications, and show that NextDoor runs them orders of magnitude faster than existing systems
Action selection policies (ASPs), used to compose low-level robot skills into complex high-level tasks are commonly represented as neural networks (NNs) in the state of the art. Such a paradigm, while very effective, suffers from a few key problems: 1) NNs are opaque to the user and hence not amenable to verification, 2) they require significant amounts of training data, and 3) they are hard to repair when the domain changes. We present two key insights about ASPs for robotics. First, ASPs need to reason about physically meaningful quantities derived from the state of the world, and second, there exists a layered structure for composing these policies. Leveraging these insights, we introduce layered dimension-informed program synthesis (LDIPS) - by reasoning about the physical dimensions of state variables, and dimensional constraints on operators, LDIPS directly synthesizes ASPs in a human-interpretable domain-specific language that is amenable to program repair. We present empirical results to demonstrate that LDIPS 1) can synthesize effective ASPs for robot soccer and autonomous driving domains, 2) requires two orders of magnitude fewer training examples than a comparable NN representation, and 3) can repair the synthesized ASPs with only a small number of corrections when transferring from simulation to real robots.
State machines are a common model for robot behaviors. Transition functions often rely on parameterized conditions to model preconditions for the controllers, where the correct values of the parameters depend on factors relating to the environment or the specific robot. In the absence of specific calibration procedures a roboticist must painstakingly adjust the parameters through a series of trial and error experiments. In this process, identifying when the robot has taken an incorrect action, and what should be done is straightforward, but finding the right parameter values can be difficult. We present an alternative approach that we call, interactive SMT-based Robot Transition Repair. During execution we record an execution trace of the transition function, and we ask the roboticist to identify a few instances where the robot has transitioned incorrectly, and what the correct transition should have been. A user supplies these corrections based on the type of error to repair, and an automated analysis of the traces partially evaluates the transition function for each correction. This system of constraints is then formulated as a MaxSMT problem, where the solution is a minimal adjustment to the parameters that satisfies the maximum number of constraints. In order to identify a repair that accurately captures user intentions and generalizes to novel scenarios, solutions are explored by iteratively adding constraints to the MaxSMT problem to yield sets of alternative repairs. We test with state machines from multiple domains including robot soccer and autonomous driving, and we evaluate solver based repair with respect to solver choice and optimization hyperparameters. Our results demonstrate that SRTR can repair a variety of states machines and error types 1) quickly, 2) with small numbers of corrections, while 3) not overcorrecting state machines and harming generalized performance.
Robots are a popular platform for introducing computing and artificial intelligence to novice programmers. However, programming state-of-the-art robots is very challenging, and requires knowledge of concurrency, operation safety, and software engineering skills, which can take years to teach. In this paper, we present an approach to introducing computing that allows students to safely and easily program high-performance robots. We develop a platform for students to program RoboCup Small Size League robots using JavaScript. The platform 1) ensures physical safety at several levels of abstraction, 2) allows students to program robots using the JavaScript in the browser, without the need to install software, and 3) presents a simplified JavaScript semantics that shields students from confusing language features. We discuss our experience running a week-long workshop using this platform, and analyze over 3,000 student-written program revisions to provide empirical evidence that our approach does help students.
For service mobile robots to be most effective, it must be possible for non-experts and even end-users to program them to do new tasks. Regardless of the programming method (e.g., by demonstration or traditional programming), robot task programs are challenging to write, because they rely on multiple actions to succeed, including human-robot interactions. Unfortunately, interactions are prone to fail, because a human may perform the wrong action (e.g., if the robot's request is not clear). Moreover, when the robot cannot directly observe the human action, it may not detect the failure until several steps after it occurs. Therefore, writing fault-tolerant robot tasks is beyond the ability of non-experts. This paper presents a principled approach to detect and recover from a broad class of failures that occur in end-user programs on service mobile robots. We present a two-tiered Robot Task Programming Language (RTPL): 1) an expert roboticist uses a specification language to write a probabilistic model of the robot's actions and interactions, and 2) a non-expert then writes an ordinary sequential program for a particular task. The RTPL runtime system executes the task program sequentially, while using the probabilistic model to build a Bayesian network that tracks possible, unobserved failures. If an error is observed, RTPL uses Bayesian inference to find the likely root cause of the error, and then attempts to re-execute a portion of the program for recovery. Our empirical results show that RTPL 1) allows complex tasks to be written concisely, 2) correctly identifies the root cause of failure, and 3) allows multiple tasks to recover from a variety of errors, without task-specific error-recovery code.
Complex robot behaviors are often structured as state machines, where states encapsulate actions and a transition function switches between states. Since transitions depend on physical parameters, when the environment changes, a roboticist has to painstakingly readjust the parameters to work in the new environment. We present interactive SMT-based Robot Transition Repair (SRTR): instead of manually adjusting parameters, we ask the roboticist to identify a few instances where the robot is in a wrong state and what the right state should be. A lightweight automated analysis of the transition function's source code then 1) identifies adjustable parameters, 2) converts the transition function into a system of logical constraints, and 3) formulates the constraints and user-supplied corrections as MaxSMT problem that yields new parameter values. Our evaluation shows that SRTR is effective on real robots and in simulation. We show that SRTR finds new parameters 1) quickly, 2) with only a few corrections, and 3) that the parameters generalize to new scenarios. We also show that a simple state machine corrected by SRTR can out-perform a more complex, expert-tuned state machine in the real world.