Jonathan P. How

MIT

Look Before You Leap: Socially Acceptable High-Speed Ground Robot Navigation in Crowded Hallways

Mar 20, 2024

Lakshay Sharma, Jonathan P. How

Abstract: To operate safely and efficiently, autonomous warehouse/delivery robots must be able to accomplish tasks while navigating in dynamic environments and handling the large uncertainties associated with the motions/behaviors of other robots and/or humans. A key scenario in such environments is the hallway problem, where robots must operate in the same narrow corridor as human traffic going in one or both directions. Traditionally, robot planners have tended to focus on socially acceptable behavior in the hallway scenario at the expense of performance. This paper proposes a planner that aims to address the consequent "robot freezing problem" in hallways by allowing for "peek-and-pass" maneuvers. We then demonstrate in simulation how this planner improves the robot's time to goal without violating social norms. Finally, we show initial hardware demonstrations of this planner in the real world.

* Submitted to IROS 2024

CLIPPER: Robust Data Association without an Initial Guess

Feb 11, 2024

Parker C. Lusk, Jonathan P. How

Abstract: Identifying correspondences in noisy data is a critically important step in estimation processes. When an informative initial estimation guess is available, the data association challenge is less acute; however, the existence of a high-quality initial guess is rare in most contexts. We explore graph-theoretic formulations for data association, which do not require an initial estimation guess. Existing graph-theoretic approaches optimize over unweighted graphs, discarding important consistency information encoded in weighted edges, and frequently attempt to solve NP-hard problems exactly. In contrast, we formulate a new optimization problem that fully leverages weighted graphs and seeks the densest edge-weighted clique. We introduce two relaxations to this problem: a convex semidefinite relaxation, which we find to be empirically tight, and a fast first-order algorithm called CLIPPER, which frequently arrives at nearly optimal solutions in milliseconds. When evaluated on point cloud registration problems, our algorithms remain robust up to at least 95% outliers while existing algorithms begin breaking down at 80% outliers. Code is available at https://mit-acl.github.io/clipper.

* 8 pages, 4 figures, accepted to RA-L
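
To make the "densest edge-weighted clique" objective in the abstract above concrete, here is a minimal Python sketch. It is not the CLIPPER algorithm or either of the paper's relaxations: it is an exhaustive search that is only practical for a handful of candidate associations. The objective form (sum of the selected affinities divided by the number of selected nodes), the convention that a zero entry marks an inconsistent pair, and the function name `densest_weighted_clique` are assumptions made for illustration.

```python
from itertools import combinations


def densest_weighted_clique(M):
    """Exhaustively find the densest edge-weighted clique of a small graph.

    M is a symmetric affinity matrix (list of lists) with entries in [0, 1].
    Assumed convention: M[i][j] == 0 means candidate associations i and j
    are mutually inconsistent, so they may not be selected together.
    Density is u^T M u / u^T u for the 0/1 indicator vector u of the subset.
    """
    n = len(M)
    best_nodes, best_density = (), 0.0
    for k in range(1, n + 1):
        for nodes in combinations(range(n), k):
            # Clique check: every selected pair must have a nonzero weight.
            if any(M[i][j] == 0 for i, j in combinations(nodes, 2)):
                continue
            total = sum(M[i][j] for i in nodes for j in nodes)
            density = total / k
            if density > best_density:
                best_nodes, best_density = nodes, density
    return best_nodes, best_density


# Toy example: associations 0, 1, 2 agree strongly with one another;
# association 3 is weakly consistent with 0 only.
M = [
    [1.0, 0.9, 0.8, 0.3],
    [0.9, 1.0, 0.9, 0.0],
    [0.8, 0.9, 1.0, 0.0],
    [0.3, 0.0, 0.0, 1.0],
]
nodes, density = densest_weighted_clique(M)
print(nodes, round(density, 3))
```

On this toy matrix the search selects associations 0, 1, and 2. The exponential cost of enumerating subsets is what makes relaxations such as those named in the abstract necessary at realistic problem sizes.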

SOS-SLAM: Segmentation for Open-Set SLAM in Unstructured Environments

Jan 09, 2024

Jouko Kinnari, Annika Thomas, Parker Lusk, Kota Kondo, Jonathan P. How

Abstract: We present a novel framework for open-set Simultaneous Localization and Mapping (SLAM) in unstructured environments that uses segmentation to create a map of objects and geometric relationships between objects for localization. Our system consists of 1) a front-end mapping pipeline using a zero-shot segmentation model to extract object masks from images and track them across frames to generate an object-based map and 2) a frame alignment pipeline that uses the geometric consistency of objects to efficiently localize within maps taken in a variety of conditions. This approach is shown to be more robust to changes in lighting and appearance than traditional feature-based SLAM systems or global descriptor methods. This is established by evaluating SOS-SLAM on the Batvik seasonal dataset, which includes drone flights collected over a coastal plot of southern Finland during different seasons and lighting conditions. Across flights during varying environmental conditions, our approach achieves higher recall than benchmark methods with precision of 1.0. SOS-SLAM localizes within a reference map up to 14x faster than other feature-based approaches and has a map size less than 0.4% the size of the most compact other maps. When considering localization performance from varying viewpoints, our approach outperforms all benchmarks from the same viewpoint and most benchmarks from different viewpoints. SOS-SLAM is a promising new approach for SLAM in unstructured environments that is robust to changes in lighting and appearance and is more computationally efficient than other approaches. We release our code and datasets: https://acl.mit.edu/SOS-SLAM/.

* 8 pages, 7 figures

MURP: Multi-Agent Ultra-Wideband Relative Pose Estimation with Constrained Communications in 3D Environments

Dec 29, 2023

Andrew Fishberg, Brian Quiter, Jonathan P. How

Abstract: Inter-agent relative localization is critical for many multi-robot systems operating in the absence of external positioning infrastructure or prior environmental knowledge. We propose a novel inter-agent relative 3D pose estimation system where each participating agent is equipped with several ultra-wideband (UWB) ranging tags. Prior work typically supplements noisy UWB range measurements with additional continuously transmitted data, such as odometry, leading to potential scaling issues with increased team size and/or decreased communication network capability. By equipping each agent with multiple UWB antennas, our approach addresses these concerns by using only locally collected UWB range measurements, a priori state constraints, and detections of when said constraints are violated. Leveraging our learned mean ranging bias correction, we gain a 19% positional error improvement, giving us experimental mean absolute position and heading errors of 0.24 m and 9.5 degrees, respectively. When compared to other state-of-the-art approaches, our work demonstrates improved performance over similar systems, while remaining competitive with methods that have significantly higher communication costs. Additionally, we make our datasets available.

Tube-NeRF: Efficient Imitation Learning of Visuomotor Policies from MPC using Tube-Guided Data Augmentation and NeRFs

Nov 23, 2023

Andrea Tagliabue, Jonathan P. How

Abstract: Imitation learning (IL) can train computationally efficient sensorimotor policies from a resource-intensive Model Predictive Controller (MPC), but it often requires many samples, leading to long training times or limited robustness. To address these issues, we combine IL with a variant of robust MPC that accounts for process and sensing uncertainties, and we design a data augmentation (DA) strategy that enables efficient learning of vision-based policies. The proposed DA method, named Tube-NeRF, leverages Neural Radiance Fields (NeRFs) to generate novel synthetic images, and uses properties of the robust MPC (the tube) to select relevant views and to efficiently compute the corresponding actions. We tailor our approach to the task of localization and trajectory tracking on a multirotor, by learning a visuomotor policy that generates control actions using images from the onboard camera as the only source of horizontal position. Our evaluations numerically demonstrate learning of a robust visuomotor policy with an 80-fold increase in demonstration efficiency and a 50% reduction in training time over current IL methods. Additionally, our policies successfully transfer to a real multirotor, achieving accurate localization and low tracking errors despite large disturbances, with an onboard inference time of only 1.5 ms.

* Video: https://youtu.be/_W5z33ZK1m4. Evolved paper from our previous work: arXiv:2210.10127

EVORA: Deep Evidential Traversability Learning for Risk-Aware Off-Road Autonomy

Nov 10, 2023

Xiaoyi Cai, Siddharth Ancha, Lakshay Sharma, Philip R. Osteen, Bernadette Bucher, Stephen Phillips, Jiuguang Wang, Michael Everett, Nicholas Roy, Jonathan P. How

Abstract: Traversing terrain with good traction is crucial for achieving fast off-road navigation. Instead of manually designing costs based on terrain features, existing methods learn terrain properties directly from data via self-supervision, but challenges remain to properly quantify and mitigate risks due to uncertainties in learned models. This work efficiently quantifies both aleatoric and epistemic uncertainties by learning discrete traction distributions and probability densities of the traction predictor's latent features. Leveraging evidential deep learning, we parameterize Dirichlet distributions with the network outputs and propose a novel uncertainty-aware squared Earth Mover's distance loss with a closed-form expression that improves learning accuracy and navigation performance. The proposed risk-aware planner simulates state trajectories with the worst-case expected traction to handle aleatoric uncertainty, and penalizes trajectories moving through terrain with high epistemic uncertainty. Our approach is extensively validated in simulation and on wheeled and quadruped robots, showing improved navigation performance compared to methods that assume no slip, assume the expected traction, or optimize for the worst-case expected cost.

* Under review. Journal extension for arXiv:2210.00153. Project website: https://xiaoyi-cai.github.io/evora/
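
For readers unfamiliar with the loss family the abstract above refers to, the following Python sketch computes a plain squared Earth Mover's distance between two discrete distributions over the same ordered bins. It is not the paper's uncertainty-aware loss, and it does not involve Dirichlet parameters. It assumes the form commonly used for ordered classes, the sum of squared differences between the two cumulative distributions with unit bin spacing; the function name `squared_emd` is hypothetical.

```python
from itertools import accumulate


def squared_emd(p, q):
    """Sum of squared differences between the CDFs of p and q.

    p and q are discrete distributions over the same ordered bins
    (for example, discretized traction values). Unit spacing between
    adjacent bins is assumed.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    cdf_p = accumulate(p)
    cdf_q = accumulate(q)
    return sum((a - b) ** 2 for a, b in zip(cdf_p, cdf_q))


# Mass placed two bins away from the target is penalized more than
# mass placed one bin away, unlike a bin-wise loss such as cross-entropy.
target = [1.0, 0.0, 0.0]
print(squared_emd(target, [0.0, 1.0, 0.0]))  # one bin off
print(squared_emd(target, [0.0, 0.0, 1.0]))  # two bins off
```

The property illustrated here, that the penalty grows with how far probability mass sits from the target bin, is what makes EMD-style losses a natural fit for ordered quantities.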

PUMA: Fully Decentralized Uncertainty-aware Multiagent Trajectory Planner with Real-time Image Segmentation-based Frame Alignment

Nov 07, 2023

Kota Kondo, Claudius T. Tewari, Mason B. Peterson, Annika Thomas, Jouko Kinnari, Andrea Tagliabue, Jonathan P. How

Abstract: Fully decentralized, multiagent trajectory planners enable complex tasks like search and rescue or package delivery by ensuring safe navigation in unknown environments. However, deconflicting trajectories with other agents and ensuring collision-free paths in a fully decentralized setting is complicated by dynamic elements and localization uncertainty. To this end, this paper presents (1) an uncertainty-aware multiagent trajectory planner and (2) an image segmentation-based frame alignment pipeline. The uncertainty-aware planner propagates uncertainty associated with the future motion of detected obstacles, and by incorporating this propagated uncertainty into optimization constraints, the planner effectively navigates around obstacles. Unlike conventional methods that emphasize explicit obstacle tracking, our approach integrates implicit tracking. Sharing trajectories between agents can cause potential collisions due to frame misalignment. Addressing this, we introduce a novel frame alignment pipeline that rectifies inter-agent frame misalignment. This method leverages a zero-shot image segmentation model for detecting objects in the environment and a data association framework based on geometric consistency for map alignment. Our approach accurately aligns frames with only 0.18 m and 2.7 deg of mean frame alignment error in our most challenging simulation scenario. In addition, we conducted hardware experiments and successfully achieved 0.29 m and 2.59 deg of frame alignment error. Together with the alignment framework, our planner ensures safe navigation in unknown environments and collision avoidance in decentralized settings.

* 7 pages, 13 figures, conference paper

REAL: Resilience and Adaptation using Large Language Models on Autonomous Aerial Robots

Nov 02, 2023

Andrea Tagliabue, Kota Kondo, Tong Zhao, Mason Peterson, Claudius T. Tewari, Jonathan P. How

Abstract: Large Language Models (LLMs) pre-trained on internet-scale datasets have shown impressive capabilities in code understanding, synthesis, and general-purpose question answering. Key to their performance is the substantial prior knowledge acquired during training and their ability to reason over extended sequences of symbols, often presented in natural language. In this work, we aim to harness the extensive long-term reasoning, natural language comprehension, and the available prior knowledge of LLMs for increased resilience and adaptation in autonomous mobile robots. We introduce REAL, an approach for REsilience and Adaptation using LLMs. REAL provides a strategy to employ LLMs as a part of the mission planning and control framework of an autonomous robot. The LLM employed by REAL provides (i) a source of prior knowledge to increase resilience for challenging scenarios that the system had not been explicitly designed for; (ii) a way to interpret natural-language and other log/diagnostic information available in the autonomy stack, for mission planning; (iii) a way to adapt the control inputs using minimal user-provided prior knowledge about the dynamics/kinematics of the robot. We integrate REAL in the autonomy stack of a real multirotor, querying onboard an offboard LLM at 0.1-1.0 Hz as part of the robot's mission planning and control feedback loops. We demonstrate in real-world experiments the ability of the LLM to reduce the position tracking errors of a multirotor in the presence of (i) errors in the parameters of the controller and (ii) unmodeled dynamics. We also show (iii) decision making to avoid potentially dangerous scenarios (e.g., robot oscillates) that had not been explicitly accounted for in the initial prompt design.

* 13 pages, 5 figures, conference workshop

Wide-Area Geolocalization with a Limited Field of View Camera in Challenging Urban Environments

Aug 14, 2023

Lena M. Downes, Ted J. Steiner, Rebecca L. Russell, Jonathan P. How

Abstract: Cross-view geolocalization, a supplement or replacement for GPS, localizes an agent within a search area by matching ground-view images to overhead images. Significant progress has been made assuming a panoramic ground camera. Panoramic cameras' high complexity and cost make non-panoramic cameras more widely applicable, but also more challenging since they yield less scene overlap between ground and overhead images. This paper presents Restricted FOV Wide-Area Geolocalization (ReWAG), a cross-view geolocalization approach that combines a neural network and particle filter to globally localize a mobile agent with only odometry and a non-panoramic camera. ReWAG creates pose-aware embeddings and provides a strategy to incorporate particle pose into the Siamese network, improving localization accuracy by a factor of 100 compared to a vision transformer baseline. This extended work also presents ReWAG*, which improves upon ReWAG's generalization ability in previously unseen environments. ReWAG* repeatedly converges accurately on a dataset of images we have collected in Boston with a 72-degree field of view (FOV) camera, a location and FOV that ReWAG* was not trained on.

* 10 pages, 16 figures. Extension of ICRA 2023 paper arXiv:2209.11854
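
The abstract above combines a neural network with a particle filter. As a rough illustration of how the two can interact, the Python sketch below performs one generic particle-filter measurement update in which each particle's weight is scaled by a likelihood derived from an embedding distance. The Gaussian likelihood, the `sigma` parameter, and the function name `reweight` are assumptions for illustration; the abstract does not specify ReWAG's measurement model.

```python
import math


def reweight(weights, distances, sigma=1.0):
    """One particle-filter measurement update.

    weights:   prior particle weights (non-negative, one per particle)
    distances: hypothetical distance between the ground-image embedding
               and the overhead-image embedding at each particle's pose
    sigma:     assumed likelihood bandwidth (illustrative, not from the paper)
    Returns the normalized posterior weights.
    """
    if len(weights) != len(distances):
        raise ValueError("need one distance per particle")
    likelihoods = [math.exp(-0.5 * (d / sigma) ** 2) for d in distances]
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("all particles have zero posterior weight")
    return [u / total for u in unnormalized]


# Two equally weighted particles: the one whose overhead embedding is
# closer to the ground-view embedding gains weight after the update.
posterior = reweight([0.5, 0.5], [0.0, 2.0])
print(posterior)
```

In a full filter this update would alternate with an odometry-driven motion step and periodic resampling, neither of which is shown here.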

RAYEN: Imposition of Hard Convex Constraints on Neural Networks

Jul 17, 2023

Jesus Tordesillas, Jonathan P. How, Marco Hutter

Abstract: This paper presents RAYEN, a framework to impose hard convex constraints on the output or latent variable of a neural network. RAYEN guarantees that, for any input or any weights of the network, the constraints are satisfied at all times. Compared to other approaches, RAYEN does not perform a computationally expensive orthogonal projection step onto the feasible set, does not rely on soft constraints (which do not guarantee the satisfaction of the constraints at test time), does not use conservative approximations of the feasible set, and does not perform a potentially slow inner gradient descent correction to enforce the constraints. RAYEN supports any combination of linear, convex quadratic, second-order cone (SOC), and linear matrix inequality (LMI) constraints, achieving a very small computational overhead compared to unconstrained networks. For example, it is able to impose 1K quadratic constraints on a 1K-dimensional variable with an overhead of less than 8 ms, and an LMI constraint with 300x300 dense matrices on a 10K-dimensional variable in less than 12 ms. When used in neural networks that approximate the solution of constrained optimization problems, RAYEN achieves computation times between 20 and 7468 times faster than state-of-the-art algorithms, while guaranteeing the satisfaction of the constraints at all times and obtaining a cost very close to the optimal one.
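
One way to guarantee constraint satisfaction without an orthogonal projection is to move from a known strictly feasible point along the direction the network proposes, and to cap the step length at the boundary of the feasible set. The Python sketch below illustrates that idea for linear constraints A y <= b only. It is a simplified illustration under stated assumptions, not the RAYEN implementation: the interior point `y0` is assumed to be given, and the function name `constrain_to_polytope` is hypothetical.

```python
import math


def constrain_to_polytope(v, A, b, y0):
    """Map an unconstrained vector v to a point satisfying A y <= b.

    The output is y0 + t * d, where d = v / ||v|| and t is ||v|| capped
    at the largest step that keeps every row constraint satisfied.
    y0 must be strictly feasible (A y0 < b); it is assumed to be known.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(y0)
    d = [x / norm for x in v]
    t_max = math.inf
    for row, b_i in zip(A, b):
        slack = b_i - sum(a * y for a, y in zip(row, y0))
        if slack <= 0.0:
            raise ValueError("y0 must be strictly inside the feasible set")
        rate = sum(a * x for a, x in zip(row, d))
        if rate > 0.0:  # only constraints the ray moves toward can bind
            t_max = min(t_max, slack / rate)
    step = min(norm, t_max)
    return [y + step * x for y, x in zip(y0, d)]


# Unit box |y_i| <= 1 written as four linear inequalities.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 1.0, 1.0, 1.0]
print(constrain_to_polytope([3.0, 0.0], A, b, [0.0, 0.0]))  # clipped to the boundary
print(constrain_to_polytope([0.5, 0.0], A, b, [0.0, 0.0]))  # already feasible, unchanged
```

Because the cap is computed in closed form per constraint, the mapping needs no iterative correction, and every output is feasible by construction regardless of the input vector.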
