Koushil Sreenath

Humanoid Locomotion as Next Token Prediction

Feb 29, 2024

Ilija Radosavovic, Bike Zhang, Baifeng Shi, Jathushan Rajasegaran, Sarthak Kamat, Trevor Darrell, Koushil Sreenath, Jitendra Malik

Abstract: We cast real-world humanoid control as a next token prediction problem, akin to predicting the next word in language. Our model is a causal transformer trained via autoregressive prediction of sensorimotor trajectories. To account for the multi-modal nature of the data, we perform prediction in a modality-aligned way, and for each input token predict the next token from the same modality. This general formulation enables us to leverage data with missing modalities, like video trajectories without actions. We train our model on a collection of simulated trajectories coming from prior neural network policies, model-based controllers, motion capture data, and YouTube videos of humans. We show that our model enables a full-sized humanoid to walk in San Francisco zero-shot. Our model can transfer to the real world even when trained on only 27 hours of walking data, and can generalize to commands not seen during training like walking backward. These findings suggest a promising path toward learning challenging real-world control tasks by generative modeling of sensorimotor trajectories.
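
The modality-aligned prediction described in the abstract can be illustrated with a small loss computation. This is only a sketch under stated assumptions, not the authors' implementation: the function name `modality_aligned_loss`, the mean-squared-error objective, the array shapes, and the boolean `act_present` mask for trajectories without actions are all hypothetical choices made for illustration.

```python
import numpy as np

def modality_aligned_loss(pred_obs, pred_act, obs, act, act_present):
    """Toy modality-aligned next-token loss (illustrative assumption, MSE-based).

    pred_obs[t] is the prediction made at the observation token of step t
    for the observation at step t + 1; pred_act[t] plays the same role for
    actions. act_present[t] is False where the action at step t is missing
    (for example, a video trajectory without actions).
    """
    # Each modality is compared only against the next token of the same modality.
    obs_err = ((pred_obs[:-1] - obs[1:]) ** 2).mean(axis=-1)
    act_err = ((pred_act[:-1] - act[1:]) ** 2).mean(axis=-1)

    # Missing action targets contribute nothing to the loss.
    mask = act_present[1:].astype(float)
    obs_loss = obs_err.mean()
    act_loss = (act_err * mask).sum() / max(mask.sum(), 1.0)
    return obs_loss + act_loss

# Hypothetical trajectory: 4 steps, 3-dim observations, 2-dim actions.
T = 4
obs = np.arange(T * 3, dtype=float).reshape(T, 3)
act = np.ones((T, 2))
perfect_obs = np.vstack([obs[1:], obs[-1:]])  # exactly predicts the next observation
zero_act = np.zeros((T, 2))                   # wrong action predictions

with_actions = modality_aligned_loss(perfect_obs, zero_act, obs, act, np.ones(T, dtype=bool))
without_actions = modality_aligned_loss(perfect_obs, zero_act, obs, act, np.zeros(T, dtype=bool))
```

With actions present, the wrong action predictions are penalized; when the mask marks every action as missing, only the observation term remains, which is how data without actions could still be used in this simplified setting.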

Safety Filters for Black-Box Dynamical Systems by Learning Discriminating Hyperplanes

Feb 07, 2024

Will Lavanakul, Jason J. Choi, Koushil Sreenath, Claire J. Tomlin

Abstract: Learning-based approaches are emerging as an effective way to build safety filters for black-box dynamical systems. Existing methods have relied on certificate functions like Control Barrier Functions (CBFs) and Hamilton-Jacobi (HJ) reachability value functions. The primary motivation for our work is the recognition that ultimately, enforcing the safety constraint as a control input constraint at each state is what matters. By focusing on this constraint, we can eliminate dependence on any specific certificate function-based design. To achieve this, we define a discriminating hyperplane that shapes the half-space constraint on control input at each state, serving as a sufficient condition for safety. This concept not only generalizes over traditional safety methods but also simplifies safety filter design by eliminating dependence on specific certificate functions. We present two strategies to learn the discriminating hyperplane: (a) a supervised learning approach, using pre-verified control invariant sets for labeling, and (b) a reinforcement learning (RL) approach, which does not require such labels. The main advantage of our method, unlike conventional safe RL approaches, is the separation of performance and safety. This offers a reusable safety filter for learning new tasks, avoiding the need to retrain from scratch. As such, we believe that the new notion of the discriminating hyperplane offers a more generalizable direction towards designing safety filters, encompassing and extending existing certificate-function-based or safe RL methodologies.

* * indicate co-first authors. This is an extended version of the paper submitted to L4DC 2024
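
The half-space constraint on the control input mentioned in the abstract suggests a simple filtering step, sketched below. The notation is an assumption made for illustration: the hyperplane at the current state is written as a vector `a` and a scalar `b`, the safe set of inputs as all `u` with `a @ u <= b`, and the filter as a Euclidean projection of a desired input onto that set. The function name `filter_control` and the numbers are hypothetical, and learning `a` and `b` is not shown.

```python
import numpy as np

def filter_control(u_des, a, b):
    """Project u_des onto the half-space {u : a @ u <= b}.

    a and b stand for a hyperplane evaluated at the current state
    (assumed given here); a must be nonzero.
    """
    u_des = np.asarray(u_des, dtype=float)
    a = np.asarray(a, dtype=float)
    violation = a @ u_des - b
    if violation <= 0.0:
        # The desired input already satisfies the constraint.
        return u_des
    # Closed-form Euclidean projection onto the boundary a @ u = b.
    return u_des - (violation / (a @ a)) * a

# Hypothetical hyperplane: limit the first input component to 0.5.
a = np.array([1.0, 0.0])
b = 0.5
u_safe = filter_control([2.0, 1.0], a, b)   # array([0.5, 1.0])
u_same = filter_control([0.0, 0.0], a, b)   # unchanged, already feasible
```

Because the filter only touches inputs that violate the constraint, a task policy producing `u_des` can be swapped out without changing the filter, which mirrors the separation of performance and safety described above.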

Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control

Jan 30, 2024

Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

Abstract: This paper presents a comprehensive study on using deep reinforcement learning (RL) to create dynamic locomotion controllers for bipedal robots. Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing. Our RL-based controller incorporates a novel dual-history architecture, utilizing both a long-term and short-term input/output (I/O) history of the robot. This control architecture, when trained through the proposed end-to-end RL approach, consistently outperforms other methods across a diverse range of skills in both simulation and the real world. The study also delves into the adaptivity and robustness introduced by the proposed RL system in developing locomotion controllers. We demonstrate that the proposed architecture can adapt to both time-invariant dynamics shifts and time-variant changes, such as contact events, by effectively using the robot's I/O history. Additionally, we identify task randomization as another key source of robustness, fostering better task generalization and compliance to disturbances. The resulting control policies can be successfully deployed on Cassie, a torque-controlled human-sized bipedal robot. This work pushes the limits of agility for bipedal robots through extensive real-world experiments. We demonstrate a diverse range of locomotion skills, including: robust standing, versatile walking, fast running with a demonstration of a 400-meter dash, and a diverse set of jumping skills, such as standing long jumps and high jumps.

Constraint-Guided Online Data Selection for Scalable Data-Driven Safety Filters in Uncertain Robotic Systems

Nov 23, 2023

Jason J. Choi, Fernando Castañeda, Wonsuhk Jung, Bike Zhang, Claire J. Tomlin, Koushil Sreenath

Abstract: As the use of autonomous robotic systems expands in tasks that are complex and challenging to model, the demand for robust data-driven control methods that can certify safety and stability in uncertain conditions is increasing. However, the practical implementation of these methods often faces scalability issues, because the number of data points grows with system complexity, and relies heavily on high-quality training data. In response to these challenges, this study presents a scalable data-driven controller that efficiently identifies and infers from the most informative data points for implementing data-driven safety filters. Our approach is grounded in the integration of a model-based certificate function-based method and Gaussian Process (GP) regression, reinforced by a novel online data selection algorithm that reduces time complexity from quadratic to linear relative to dataset size. Empirical evidence, gathered from successful real-world cart-pole swing-up experiments and simulated locomotion of a five-link bipedal robot, demonstrates the efficacy of our approach. Our findings reveal that our efficient online data selection algorithm, which strategically selects key data points, enhances the practicality and efficiency of data-driven certifying filters in complex robotic systems, significantly mitigating scalability concerns inherent in nonparametric learning-based control methods.

* The first three authors contributed equally to the work. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Prompt a Robot to Walk with Large Language Models

Sep 18, 2023

Yen-Jen Wang, Bike Zhang, Jianyu Chen, Koushil Sreenath

Abstract: Large language models (LLMs) pre-trained on vast internet-scale data have showcased remarkable capabilities across diverse domains. Recently, there has been escalating interest in deploying LLMs for robotics, aiming to harness the power of foundation models in real-world settings. However, this approach faces significant challenges, particularly in grounding these models in the physical world and in generating dynamic robot motions. To address these issues, we introduce a novel paradigm in which we use few-shot prompts collected from the physical environment, enabling the LLM to autoregressively generate low-level control commands for robots without task-specific fine-tuning. Experiments across various robots and environments validate that our method can effectively prompt a robot to walk. We thus illustrate how LLMs can proficiently function as low-level feedback controllers for dynamic motion control even in high-dimensional robotic systems. The project website and source code can be found at: https://prompt2walk.github.io/

Nonsmooth Control Barrier Functions for Obstacle Avoidance between Convex Regions

Jun 23, 2023

Akshay Thirugnanam, Jun Zeng, Koushil Sreenath

Abstract: In this paper, we focus on non-conservative obstacle avoidance between robots with control affine dynamics with strictly convex and polytopic shapes. The core challenge for this obstacle avoidance problem is that the minimum distance between strictly convex regions or polytopes is generally implicit and non-smooth, such that distance constraints cannot be enforced directly in the optimization problem. To handle this challenge, we employ non-smooth control barrier functions to reformulate the avoidance problem in the dual space, with the positivity of the minimum distance between robots equivalently expressed using a quadratic program. Our approach is proven to guarantee system safety. We theoretically analyze the smoothness properties of the minimum distance quadratic program and its KKT conditions. We validate our approach by demonstrating computationally efficient obstacle avoidance for multi-agent robotic systems with strictly convex and polytopic shapes. To the best of our knowledge, this is the first time a real-time QP problem can be formulated for general non-conservative avoidance between strictly convex shapes and polytopes.

* 17 pages

Velocity Obstacle for Polytopic Collision Avoidance for Distributed Multi-robot Systems

Apr 17, 2023

Jihao Huang, Jun Zeng, Xuemin Chi, Koushil Sreenath, Zhitao Liu, Hongye Su

Abstract: Obstacle avoidance for multi-robot navigation with polytopic shapes is challenging. Existing works simplify the system dynamics or consider it as a convex or non-convex optimization problem with positive distance constraints between robots, which limits real-time performance and scalability. Additionally, generating collision-free behavior for polytopic-shaped robots is harder due to implicit and non-differentiable distance functions between polytopes. In this paper, we extend the velocity obstacle (VO) principle to polytopic-shaped robots and propose a novel approach to construct the VO as a function of vertex coordinates and the other robots' states. Compared with existing work on obstacle avoidance between polytopic-shaped robots, our approach is much more computationally efficient, as the proposed construction of the VO between polytopes is optimization-free. Based on the VO representation for polytopic shapes, we then propose a navigation approach for distributed multi-robot systems. We validate our proposed VO representation and navigation approach in multiple challenging scenarios including large-scale randomized tests, and our approach outperforms the state of the art in many evaluation metrics, including completion rate, deadlock rate, and the average travel distance.

* Accepted to IEEE Robotics and Automation Letters (RA-L) 2023

i2LQR: Iterative LQR for Iterative Tasks in Dynamic Environments

Mar 17, 2023

Yifan Zeng, Suiyi He, Han Hoang Nguyen, Zhongyu Li, Koushil Sreenath, Jun Zeng

Abstract: This work introduces a novel control strategy called Iterative Linear Quadratic Regulator for Iterative Tasks (i2LQR), which aims to pursue optimal performance for iterative tasks in a dynamic environment. The proposed algorithm is reference-free and utilizes historical data from previous iterations to enhance the performance of the autonomous system. Unlike existing algorithms, the i2LQR computes the optimal solution in an iterative manner at each timestamp, rendering it well-suited for iterative tasks with changing constraints at different iterations. To evaluate the performance of the proposed algorithm, we conduct numerical simulations for an iterative task aimed at minimizing completion time. The results show that i2LQR matches the optimal performance of the state-of-the-art algorithm in static environments, and outperforms the state-of-the-art algorithm in dynamic environments with both static and dynamic obstacles.

* Submitted to IEEE Control Systems Letters (L-CSS)

Learning Humanoid Locomotion with Transformers

Mar 06, 2023

Ilija Radosavovic, Tete Xiao, Bike Zhang, Trevor Darrell, Jitendra Malik, Koushil Sreenath

Abstract: We present a sim-to-real learning-based approach for real-world humanoid locomotion. Our controller is a causal Transformer trained by autoregressive prediction of future actions from the history of observations and actions. We hypothesize that the observation-action history contains useful information about the world that a powerful Transformer model can use to adapt its behavior in-context, without updating its weights. We do not use state estimation, dynamics models, trajectory optimization, reference trajectories, or pre-computed gait libraries. Our controller is trained with large-scale model-free reinforcement learning on an ensemble of randomized environments in simulation and deployed to the real world in a zero-shot fashion. We evaluate our approach in high-fidelity simulation and successfully deploy it to the real robot as well. To the best of our knowledge, this is the first demonstration of a fully learning-based method for real-world full-sized humanoid locomotion.

* Project page: https://humanoid-transformer.github.io

Robust and Versatile Bipedal Jumping Control through Multi-Task Reinforcement Learning

Feb 19, 2023

Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

Abstract: This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world. We present a multi-task reinforcement learning framework to train the robot to accomplish a large variety of jumping tasks, such as jumping to different locations and directions. To improve performance on these challenging tasks, we develop a new policy structure that encodes the robot's long-term input/output (I/O) history while also providing direct access to its short-term I/O history. In order to train a versatile multi-task policy, we utilize a multi-stage training scheme that includes different training stages for different objectives. After multi-stage training, the multi-task policy can be directly transferred to Cassie, a physical bipedal robot. Training on different tasks and exploring more diverse scenarios leads to highly robust policies that can exploit the diverse set of learned skills to recover from perturbations or poor landings during real-world deployment. Such robustness in the proposed multi-task policy enables Cassie to succeed in completing a variety of challenging jump tasks in the real world, such as standing long jumps, jumping onto elevated platforms, and multi-axis jumps.

* Accompanying video is at https://youtu.be/aAPSZ2QFB-E
