Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shishir Kolathaya

V-OCBF: Learning Safety Filters from Offline Data via Value-Guided Offline Control Barrier Functions

Dec 11, 2025

Mumuksh Tayal, Manan Tayal, Aditya Singh, Shishir Kolathaya, Ravi Prakash

Abstract:Ensuring safety in autonomous systems requires controllers that satisfy hard, state-wise constraints without relying on online interaction. While existing Safe Offline RL methods typically enforce soft expected-cost constraints, they do not guarantee forward invariance. Conversely, Control Barrier Functions (CBFs) provide rigorous safety guarantees but usually depend on expert-designed barrier functions or full knowledge of the system dynamics. We introduce Value-Guided Offline Control Barrier Functions (V-OCBF), a framework that learns a neural CBF entirely from offline demonstrations. Unlike prior approaches, V-OCBF does not assume access to the dynamics model; instead, it derives a recursive finite-difference barrier update, enabling model-free learning of a barrier that propagates safety information over time. Moreover, V-OCBF incorporates an expectile-based objective that avoids querying the barrier on out-of-distribution actions and restricts updates to the dataset-supported action set. The learned barrier is then used with a Quadratic Program (QP) formulation to synthesize real-time safe control. Across multiple case studies, V-OCBF yields substantially fewer safety violations than baseline methods while maintaining strong task performance, highlighting its scalability for offline synthesis of safety-critical controllers without online interaction or hand-engineered barriers.

* 23 pages, 8 figure, 7 tables

Via

Access Paper or Ask Questions

Real-Time Gait Adaptation for Quadrupeds using Model Predictive Control and Reinforcement Learning

Oct 23, 2025

Ganga Nair B, Prakrut Kotecha, Shishir Kolathaya

Abstract:Model-free reinforcement learning (RL) has enabled adaptable and agile quadruped locomotion; however, policies often converge to a single gait, leading to suboptimal performance. Traditionally, Model Predictive Control (MPC) has been extensively used to obtain task-specific optimal policies but lacks the ability to adapt to varying environments. To address these limitations, we propose an optimization framework for real-time gait adaptation in a continuous gait space, combining the Model Predictive Path Integral (MPPI) algorithm with a Dreamer module to produce adaptive and optimal policies for quadruped locomotion. At each time step, MPPI jointly optimizes the actions and gait variables using a learned Dreamer reward that promotes velocity tracking, energy efficiency, stability, and smooth transitions, while penalizing abrupt gait changes. A learned value function is incorporated as terminal reward, extending the formulation to an infinite-horizon planner. We evaluate our framework in simulation on the Unitree Go1, demonstrating an average reduction of up to 36.48\% in energy consumption across varying target speeds, while maintaining accurate tracking and adaptive, task-appropriate gaits.

Via

Access Paper or Ask Questions

COMPAct: Computational Optimization and Automated Modular design of Planetary Actuators

Oct 08, 2025

Aman Singh, Deepak Kapa, Suryank Joshi, Shishir Kolathaya

Figure 1 for COMPAct: Computational Optimization and Automated Modular design of Planetary Actuators

Figure 2 for COMPAct: Computational Optimization and Automated Modular design of Planetary Actuators

Figure 3 for COMPAct: Computational Optimization and Automated Modular design of Planetary Actuators

Figure 4 for COMPAct: Computational Optimization and Automated Modular design of Planetary Actuators

Abstract:The optimal design of robotic actuators is a critical area of research, yet limited attention has been given to optimizing gearbox parameters and automating actuator CAD. This paper introduces COMPAct: Computational Optimization and Automated Modular Design of Planetary Actuators, a framework that systematically identifies optimal gearbox parameters for a given motor across four gearbox types, single-stage planetary gearbox (SSPG), compound planetary gearbox (CPG), Wolfrom planetary gearbox (WPG), and double-stage planetary gearbox (DSPG). The framework minimizes mass and actuator width while maximizing efficiency, and further automates actuator CAD generation to enable direct 3D printing without manual redesign. Using this framework, optimal gearbox designs are explored over a wide range of gear ratios, providing insights into the suitability of different gearbox types across various gear ratio ranges. In addition, the framework is used to generate CAD models of all four gearbox types with varying gear ratios and motors. Two actuator types are fabricated and experimentally evaluated through power efficiency, no-load backlash, and transmission stiffness tests. Experimental results indicate that the SSPG actuator achieves a mechanical efficiency of 60-80 %, a no-load backlash of 0.59 deg, and a transmission stiffness of 242.7 Nm/rad, while the CPG actuator demonstrates 60 % efficiency, 2.6 deg backlash, and a stiffness of 201.6 Nm/rad. Code available at: https://anonymous.4open.science/r/COMPAct-SubNum-3408 Video: https://youtu.be/99zOKgxsDho

* 8 pages, 9 Figures, 2 tables, first two authors contributed equally

Via

Access Paper or Ask Questions

Signal Temporal Logic Compliant Co-design of Planning and Control

Jul 17, 2025

Manas Sashank Juvvi, Tushar Dilip Kurne, Vaishnavi J, Shishir Kolathaya, Pushpak Jagtap

Figure 1 for Signal Temporal Logic Compliant Co-design of Planning and Control

Figure 2 for Signal Temporal Logic Compliant Co-design of Planning and Control

Figure 3 for Signal Temporal Logic Compliant Co-design of Planning and Control

Abstract:This work presents a novel co-design strategy that integrates trajectory planning and control to handle STL-based tasks in autonomous robots. The method consists of two phases: $(i)$ learning spatio-temporal motion primitives to encapsulate the inherent robot-specific constraints and $(ii)$ constructing an STL-compliant motion plan from these primitives. Initially, we employ reinforcement learning to construct a library of control policies that perform trajectories described by the motion primitives. Then, we map motion primitives to spatio-temporal characteristics. Subsequently, we present a sampling-based STL-compliant motion planning strategy tailored to meet the STL specification. The proposed model-free approach, which generates feasible STL-compliant motion plans across various environments, is validated on differential-drive and quadruped robots across various STL specifications. Demonstration videos are available at https://tinyurl.com/m6zp7rsm.

Via

Access Paper or Ask Questions

GROQLoco: Generalist and RObot-agnostic Quadruped Locomotion Control using Offline Datasets

May 16, 2025

Narayanan PP, Sarvesh Prasanth Venkatesan, Srinivas Kantha Reddy, Shishir Kolathaya

Abstract:Recent advancements in large-scale offline training have demonstrated the potential of generalist policy learning for complex robotic tasks. However, applying these principles to legged locomotion remains a challenge due to continuous dynamics and the need for real-time adaptation across diverse terrains and robot morphologies. In this work, we propose GROQLoco, a scalable, attention-based framework that learns a single generalist locomotion policy across multiple quadruped robots and terrains, relying solely on offline datasets. Our approach leverages expert demonstrations from two distinct locomotion behaviors - stair traversal (non-periodic gaits) and flat terrain traversal (periodic gaits) - collected across multiple quadruped robots, to train a generalist model that enables behavior fusion for both behaviors. Crucially, our framework operates directly on proprioceptive data from all robots without incorporating any robot-specific encodings. The policy is directly deployable on an Intel i7 nuc, producing low-latency control outputs without any test-time optimization. Our extensive experiments demonstrate strong zero-shot transfer across highly diverse quadruped robots and terrains, including hardware deployment on the Unitree Go1, a commercially available 12kg robot. Notably, we evaluate challenging cross-robot training setups where different locomotion skills are unevenly distributed across robots, yet observe successful transfer of both flat walking and stair traversal behaviors to all robots at test time. We also show preliminary walking on Stoch 5, a 70kg quadruped, on flat and outdoor terrains without requiring any fine tuning. These results highlight the potential for robust generalist locomotion across diverse robots and terrains.

* 18pages, 16figures, 6tables

Via

Access Paper or Ask Questions

MULE: Multi-terrain and Unknown Load Adaptation for Effective Quadrupedal Locomotion

May 01, 2025

Vamshi Kumar Kurva, Shishir Kolathaya

Abstract:Quadrupedal robots are increasingly deployed for load-carrying tasks across diverse terrains. While Model Predictive Control (MPC)-based methods can account for payload variations, they often depend on predefined gait schedules or trajectory generators, limiting their adaptability in unstructured environments. To address these limitations, we propose an Adaptive Reinforcement Learning (RL) framework that enables quadrupedal robots to dynamically adapt to both varying payloads and diverse terrains. The framework consists of a nominal policy responsible for baseline locomotion and an adaptive policy that learns corrective actions to preserve stability and improve command tracking under payload variations. We validate the proposed approach through large-scale simulation experiments in Isaac Gym and real-world hardware deployment on a Unitree Go1 quadruped. The controller was tested on flat ground, slopes, and stairs under both static and dynamic payload changes. Across all settings, our adaptive controller consistently outperformed the controller in tracking body height and velocity commands, demonstrating enhanced robustness and adaptability without requiring explicit gait design or manual tuning.

* Preprint under review

Via

Access Paper or Ask Questions

Neural Control Barrier Functions from Physics Informed Neural Networks

Apr 15, 2025

Shreenabh Agrawal, Manan Tayal, Aditya Singh, Shishir Kolathaya

Abstract:As autonomous systems become increasingly prevalent in daily life, ensuring their safety is paramount. Control Barrier Functions (CBFs) have emerged as an effective tool for guaranteeing safety; however, manually designing them for specific applications remains a significant challenge. With the advent of deep learning techniques, recent research has explored synthesizing CBFs using neural networks-commonly referred to as neural CBFs. This paper introduces a novel class of neural CBFs that leverages a physics-inspired neural network framework by incorporating Zubov's Partial Differential Equation (PDE) within the context of safety. This approach provides a scalable methodology for synthesizing neural CBFs applicable to high-dimensional systems. Furthermore, by utilizing reciprocal CBFs instead of zeroing CBFs, the proposed framework allows for the specification of flexible, user-defined safe regions. To validate the effectiveness of the approach, we present case studies on three different systems: an inverted pendulum, autonomous ground navigation, and aerial navigation in obstacle-laden environments.

* 8 pages, 5 figures

Via

Access Paper or Ask Questions

Discrete time model predictive control for humanoid walking with step adjustment

Oct 09, 2024

Vishnu Joshi, Suraj Kumar, Nithin V, Shishir Kolathaya

Figure 1 for Discrete time model predictive control for humanoid walking with step adjustment

Figure 2 for Discrete time model predictive control for humanoid walking with step adjustment

Figure 3 for Discrete time model predictive control for humanoid walking with step adjustment

Figure 4 for Discrete time model predictive control for humanoid walking with step adjustment

Abstract:This paper presents a Discrete-Time Model Predictive Controller (MPC) for humanoid walking with online footstep adjustment. The proposed controller utilizes a hierarchical control approach. The high-level controller uses a low-dimensional Linear Inverted Pendulum Model (LIPM) to determine desired foot placement and Center of Mass (CoM) motion, to prevent falls while maintaining the desired velocity. A Task Space Controller (TSC) then tracks the desired motion obtained from the high-level controller, exploiting the whole-body dynamics of the humanoid. Our approach differs from existing MPC methods for walking pattern generation by not relying on a predefined foot-plan or a reference center of pressure (CoP) trajectory. The overall approach is tested in simulation on a torque-controlled Humanoid Robot. Results show that proposed control approach generates stable walking and prevents fall against push disturbances.

* 6 pages, 17 figures, 1 table

Via

Access Paper or Ask Questions

Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Sep 26, 2024

Kartik Garg, Sai Shubodh Puligilla, Shishir Kolathaya, Madhava Krishna, Sourav Garg

Figure 1 for Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Figure 2 for Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Figure 3 for Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Figure 4 for Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Abstract:Accurately recognizing a revisited place is crucial for embodied agents to localize and navigate. This requires visual representations to be distinct, despite strong variations in camera viewpoint and scene appearance. Existing visual place recognition pipelines encode the "whole" image and search for matches. This poses a fundamental challenge in matching two images of the same place captured from different camera viewpoints: "the similarity of what overlaps can be dominated by the dissimilarity of what does not overlap". We address this by encoding and searching for "image segments" instead of the whole images. We propose to use open-set image segmentation to decompose an image into `meaningful' entities (i.e., things and stuff). This enables us to create a novel image representation as a collection of multiple overlapping subgraphs connecting a segment with its neighboring segments, dubbed SuperSegment. Furthermore, to efficiently encode these SuperSegments into compact vector representations, we propose a novel factorized representation of feature aggregation. We show that retrieving these partial representations leads to significantly higher recognition recall than the typical whole image based retrieval. Our segments-based approach, dubbed SegVLAD, sets a new state-of-the-art in place recognition on a diverse selection of benchmark datasets, while being applicable to both generic and task-specialized image encoders. Finally, we demonstrate the potential of our method to ``revisit anything'' by evaluating our method on an object instance retrieval task, which bridges the two disparate areas of research: visual place recognition and object-goal navigation, through their common aim of recognizing goal objects specific to a place. Source code: https://github.com/AnyLoc/Revisit-Anything.

* Presented at ECCV 2024; Includes supplementary; 29 pages; 8 figures

Via

Access Paper or Ask Questions

PIP-Loco: A Proprioceptive Infinite Horizon Planning Framework for Quadrupedal Robot Locomotion

Sep 17, 2024

Aditya Shirwatkar, Naman Saxena, Kishore Chandra, Shishir Kolathaya

Abstract:A core strength of Model Predictive Control (MPC) for quadrupedal locomotion has been its ability to enforce constraints and provide interpretability of the sequence of commands over the horizon. However, despite being able to plan, MPC struggles to scale with task complexity, often failing to achieve robust behavior on rapidly changing surfaces. On the other hand, model-free Reinforcement Learning (RL) methods have outperformed MPC on multiple terrains, showing emergent motions but inherently lack any ability to handle constraints or perform planning. To address these limitations, we propose a framework that integrates proprioceptive planning with RL, allowing for agile and safe locomotion behaviors through the horizon. Inspired by MPC, we incorporate an internal model that includes a velocity estimator and a Dreamer module. During training, the framework learns an expert policy and an internal model that are co-dependent, facilitating exploration for improved locomotion behaviors. During deployment, the Dreamer module solves an infinite-horizon MPC problem, adapting actions and velocity commands to respect the constraints. We validate the robustness of our training framework through ablation studies on internal model components and demonstrate improved robustness to training noise. Finally, we evaluate our approach across multi-terrain scenarios in both simulation and hardware.

* Preprint under review

Via

Access Paper or Ask Questions