Abstract:As autonomous robotic systems become increasingly mature, users will want to specify missions at the level of intent rather than in low-level detail. Language is an expressive and intuitive medium for such mission specification. However, realizing language-guided robotic teams requires overcoming significant technical hurdles. Interpreting and realizing language-specified missions requires advanced semantic reasoning. Successful heterogeneous robots must effectively coordinate actions and share information across varying viewpoints. Additionally, communication between robots is typically intermittent, necessitating robust strategies that leverage communication opportunities to maintain coordination and achieve mission objectives. In this work, we present a first-of-its-kind system where an unmanned aerial vehicle (UAV) and an unmanned ground vehicle (UGV) are able to collaboratively accomplish missions specified in natural language while reacting to changes in specification on the fly. We leverage a Large Language Model (LLM)-enabled planner to reason over semantic-metric maps that are built online and opportunistically shared between an aerial and a ground robot. We consider task-driven navigation in urban and rural areas. Our system must infer mission-relevant semantics and actively acquire information via semantic mapping. In both ground and air-ground teaming experiments, we demonstrate our system on seven different natural-language specifications at up to kilometer-scale navigation.
Abstract:The integration of foundation models (FMs) into robotics has enabled robots to understand natural language and reason about the semantics in their environments. However, existing FM-enabled robots primary operate in closed-world settings, where the robot is given a full prior map or has a full view of its workspace. This paper addresses the deployment of FM-enabled robots in the field, where missions often require a robot to operate in large-scale and unstructured environments. To effectively accomplish these missions, robots must actively explore their environments, navigate obstacle-cluttered terrain, handle unexpected sensor inputs, and operate with compute constraints. We discuss recent deployments of SPINE, our LLM-enabled autonomy framework, in field robotic settings. To the best of our knowledge, we present the first demonstration of large-scale LLM-enabled robot planning in unstructured environments with several kilometers of missions. SPINE is agnostic to a particular LLM, which allows us to distill small language models capable of running onboard size, weight and power (SWaP) limited platforms. Via preliminary model distillation work, we then present the first language-driven UAV planner using on-device language models. We conclude our paper by proposing several promising directions for future research.
Abstract:In this letter, we address the task of adaptive sampling to model vector fields. When modeling environmental phenomena with a robot, gathering high resolution information can be resource intensive. Actively gathering data and modeling flows with the data is a more efficient alternative. However, in such scenarios, data is often sparse and thus requires flow modeling techniques that are effective at capturing the relevant dynamical features of the flow to ensure high prediction accuracy of the resulting models. To accomplish this effectively, regions with high informative value must be identified. We propose EnKode, an active sampling approach based on Koopman Operator theory and ensemble methods that can build high quality flow models and effectively estimate model uncertainty. For modeling complex flows, EnKode provides comparable or better estimates of unsampled flow regions than Gaussian Process Regression models with hyperparameter optimization. Additionally, our active sensing scheme provides more accurate flow estimates than comparable strategies that rely on uniform sampling. We evaluate EnKode using three common benchmarking systems: the Bickley Jet, Lid-Driven Cavity flow with an obstacle, and real ocean currents from the National Oceanic and Atmospheric Administration (NOAA).
Abstract:Flying quadrotors in tight formations is a challenging problem. It is known that in the near-field airflow of a quadrotor, the aerodynamic effects induced by the propellers are complex and difficult to characterize. Although machine learning tools can potentially be used to derive models that capture these effects, these data-driven approaches can be sample inefficient and the resulting models often do not generalize as well as their first-principles counterparts. In this work, we propose a framework that combines the benefits of first-principles modeling and data-driven approaches to construct an accurate and sample efficient representation of the complex aerodynamic effects resulting from quadrotors flying in formation. The data-driven component within our model is lightweight, making it amenable for optimization-based control design. Through simulations and physical experiments, we show that incorporating the model into a novel learning-based nonlinear model predictive control (MPC) framework results in substantial performance improvements in terms of trajectory tracking and disturbance rejection. In particular, our framework significantly outperforms nominal MPC in physical experiments, achieving a 40.1% improvement in the average trajectory tracking errors and a 57.5% reduction in the maximum vertical separation errors. Our framework also achieves exceptional sample efficiency, using only a total of 46 seconds of flight data for training across both simulations and physical experiments. Furthermore, with our proposed framework, the quadrotors achieve an exceptionally tight formation, flying with an average separation of less than 1.5 body lengths throughout the flight. A video illustrating our framework and physical experiments is given here: https://youtu.be/Hv-0JiVoJGo
Abstract:Traditionally, unmanned aerial vehicles (UAVs) rely on CMOS-based cameras to collect images about the world below. One of the most successful applications of UAVs is to generate orthomosaics or orthomaps, in which a series of images are integrated together to develop a larger map. However, the use of CMOS-based cameras with global or rolling shutters mean that orthomaps are vulnerable to challenging light conditions, motion blur, and high-speed motion of independently moving objects under the camera. Event cameras are less sensitive to these issues, as their pixels are able to trigger asynchronously on brightness changes. This work introduces the first orthomosaic approach using event cameras. In contrast to existing methods relying only on CMOS cameras, our approach enables map generation even in challenging light conditions, including direct sunlight and after sunset.
Abstract:Coordinating the motion of multiple robots in cluttered environments remains a computationally challenging task. We study the problem of minimizing the execution time of a set of geometric paths by a team of robots with state-dependent actuation constraints. We propose a Time-Optimal Path Parameterization (TOPP) algorithm for multiple car-like agents, where the modulation of the timing of every robot along its assigned path is employed to ensure collision avoidance and dynamic feasibility. This is achieved through the use of a priority queue to determine the order of trajectory execution for each robot while taking into account all possible collisions with higher priority robots in a spatiotemporal graph. We show a 10-20% reduction in makespan against existing state-of-the-art methods and validate our approach through simulations and hardware experiments.
Abstract:Accurate and efficient fluid flow models are essential for applications relating to many physical phenomena including geophysical, aerodynamic, and biological systems. While these flows may exhibit rich and multiscale dynamics, in many cases underlying low-rank structures exist which describe the bulk of the motion. These structures tend to be spatially large and temporally slow, and may contain most of the energy in a given flow. The extraction and parsimonious representation of these low-rank dynamics from high-dimensional data is a key challenge. Inspired by the success of physics-informed machine learning methods, we propose a spectrally-informed approach to extract low-rank models of fluid flows by leveraging known spectral properties in the learning process. We incorporate this knowledge by imposing regularizations on the learned dynamics, which bias the training process towards learning low-frequency structures with corresponding higher power. We demonstrate the effectiveness of this method to improve prediction and produce learned models which better match the underlying spectral properties of prototypical fluid flows.
Abstract:Soft robots have many advantages over rigid robots thanks to their compliant and passive nature. However, it is generally challenging to model the dynamics of soft robots due to their high spatial dimensionality, making it difficult to use model-based methods to accurately control soft robots. It often requires direct numerical simulation of partial differential equations to simulate soft robots. This not only requires an accurate numerical model, but also makes soft robot modeling slow and expensive. Deep learning algorithms have shown promises in data-driven modeling of soft robots. However, these algorithms usually require a large amount of data, which are difficult to obtain in either simulation or real-world experiments of soft robots. In this work, we propose KNODE-Cosserat, a framework that combines first-principle physics models and neural ordinary differential equations. We leverage the best from both worlds -- the generalization ability of physics-based models and the fast speed of deep learning methods. We validate our framework in both simulation and real-world experiments. In both cases, we show that the robot model significantly improves over the baseline models under different metrics.
Abstract:One common and desirable application of robots is exploring potentially hazardous and unstructured environments. Air-ground collaboration offers a synergistic approach to addressing such exploration challenges. In this paper, we demonstrate a system for large-scale exploration using a team of aerial and ground robots. Our system uses semantics as lingua franca, and relies on fully opportunistic communications. We highlight the unique challenges from this approach, explain our system architecture and showcase lessons learned during our experiments. All our code is open-source, encouraging researchers to use it and build upon.
Abstract:Climate change has increased the frequency and severity of extreme weather events such as hurricanes and winter storms. The complex interplay of floods with tides, runoff, and sediment creates additional hazards -- including erosion and the undermining of urban infrastructure -- consequently impacting the health of our rivers and ecosystems. Observations of these underwater phenomena are rare, because satellites and sensors mounted on aerial vehicles cannot penetrate the murky waters. Autonomous Surface Vehicles (ASVs) provides a means to track and map these complex and dynamic underwater phenomena. This work highlights preliminary results of high-resolution data gathering with ASVs, equipped with a suite of sensors capable of measuring physical and chemical parameters of the river. Measurements were acquired along the lower Schuylkill River in the Philadelphia area at high-tide and low-tide conditions. The data will be leveraged to improve our understanding of changes in bathymetry due to floods; the dynamics of mixing and stagnation zones and their impact on water quality; and the dynamics of suspension and resuspension of fine sediment. The data will also provide insight into the development of adaptive sampling strategies for ASVs that can maximize the information gain for future field experiments.