Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hanumant Singh

Sparse Variable Projection in Robotic Perception: Exploiting Separable Structure for Efficient Nonlinear Optimization

Dec 08, 2025

Alan Papalia, Nikolas Sanderson, Haoyu Han, Heng Yang, Hanumant Singh, Michael Everett

Abstract:Robotic perception often requires solving large nonlinear least-squares (NLS) problems. While sparsity has been well-exploited to scale solvers, a complementary and underexploited structure is \emph{separability} -- where some variables (e.g., visual landmarks) appear linearly in the residuals and, for any estimate of the remaining variables (e.g., poses), have a closed-form solution. Variable projection (VarPro) methods are a family of techniques that exploit this structure by analytically eliminating the linear variables and presenting a reduced problem in the remaining variables that has favorable properties. However, VarPro has seen limited use in robotic perception; a major challenge arises from gauge symmetries (e.g., cost invariance to global shifts and rotations), which are common in perception and induce specific computational challenges in standard VarPro approaches. We present a VarPro scheme designed for problems with gauge symmetries that jointly exploits separability and sparsity. Our method can be applied as a one-time preprocessing step to construct a \emph{matrix-free Schur complement operator}. This operator allows efficient evaluation of costs, gradients, and Hessian-vector products of the reduced problem and readily integrates with standard iterative NLS solvers. We provide precise conditions under which our method applies, and describe extensions when these conditions are only partially met. Across synthetic and real benchmarks in SLAM, SNL, and SfM, our approach achieves up to \textbf{2$\times$--35$\times$ faster runtimes} than state-of-the-art methods while maintaining accuracy. We release an open-source C++ implementation and all datasets from our experiments.

* 8 pages, submitted for review

Via

Access Paper or Ask Questions

A Roadmap for Climate-Relevant Robotics Research

Jul 15, 2025

Alan Papalia, Charles Dawson, Laurentiu L. Anton, Norhan Magdy Bayomi, Bianca Champenois, Jung-Hoon Cho, Levi Cai, Joseph DelPreto, Kristen Edwards, Bilha-Catherine Githinji(+18 more)

Figure 1 for A Roadmap for Climate-Relevant Robotics Research

Figure 2 for A Roadmap for Climate-Relevant Robotics Research

Figure 3 for A Roadmap for Climate-Relevant Robotics Research

Figure 4 for A Roadmap for Climate-Relevant Robotics Research

Abstract:Climate change is one of the defining challenges of the 21st century, and many in the robotics community are looking for ways to contribute. This paper presents a roadmap for climate-relevant robotics research, identifying high-impact opportunities for collaboration between roboticists and experts across climate domains such as energy, the built environment, transportation, industry, land use, and Earth sciences. These applications include problems such as energy systems optimization, construction, precision agriculture, building envelope retrofits, autonomous trucking, and large-scale environmental monitoring. Critically, we include opportunities to apply not only physical robots but also the broader robotics toolkit - including planning, perception, control, and estimation algorithms - to climate-relevant problems. A central goal of this roadmap is to inspire new research directions and collaboration by highlighting specific, actionable problems at the intersection of robotics and climate. This work represents a collaboration between robotics researchers and domain experts in various climate disciplines, and it serves as an invitation to the robotics community to bring their expertise to bear on urgent climate priorities.

Via

Access Paper or Ask Questions

Enhancing Situational Awareness in Underwater Robotics with Multi-modal Spatial Perception

Jun 06, 2025

Pushyami Kaveti, Ambjorn Grimsrud Waldum, Hanumant Singh, Martin Ludvigsen

Figure 1 for Enhancing Situational Awareness in Underwater Robotics with Multi-modal Spatial Perception

Figure 2 for Enhancing Situational Awareness in Underwater Robotics with Multi-modal Spatial Perception

Figure 3 for Enhancing Situational Awareness in Underwater Robotics with Multi-modal Spatial Perception

Figure 4 for Enhancing Situational Awareness in Underwater Robotics with Multi-modal Spatial Perception

Abstract:Autonomous Underwater Vehicles (AUVs) and Remotely Operated Vehicles (ROVs) demand robust spatial perception capabilities, including Simultaneous Localization and Mapping (SLAM), to support both remote and autonomous tasks. Vision-based systems have been integral to these advancements, capturing rich color and texture at low cost while enabling semantic scene understanding. However, underwater conditions -- such as light attenuation, backscatter, and low contrast -- often degrade image quality to the point where traditional vision-based SLAM pipelines fail. Moreover, these pipelines typically rely on monocular or stereo inputs, limiting their scalability to the multi-camera configurations common on many vehicles. To address these issues, we propose to leverage multi-modal sensing that fuses data from multiple sensors-including cameras, inertial measurement units (IMUs), and acoustic devices-to enhance situational awareness and enable robust, real-time SLAM. We explore both geometric and learning-based techniques along with semantic analysis, and conduct experiments on the data collected from a work-class ROV during several field deployments in the Trondheim Fjord. Through our experimental results, we demonstrate the feasibility of real-time reliable state estimation and high-quality 3D reconstructions in visually challenging underwater conditions. We also discuss system constraints and identify open research questions, such as sensor calibration, limitations with learning-based methods, that merit further exploration to advance large-scale underwater operations.

Via

Access Paper or Ask Questions

NeuFlow v2: High-Efficiency Optical Flow Estimation on Edge Devices

Aug 21, 2024

Zhiyong Zhang, Aniket Gupta, Huaizu Jiang, Hanumant Singh

Abstract:Real-time high-accuracy optical flow estimation is crucial for various real-world applications. While recent learning-based optical flow methods have achieved high accuracy, they often come with significant computational costs. In this paper, we propose a highly efficient optical flow method that balances high accuracy with reduced computational demands. Building upon NeuFlow v1, we introduce new components including a much more light-weight backbone and a fast refinement module. Both these modules help in keeping the computational demands light while providing close to state of the art accuracy. Compares to other state of the art methods, our model achieves a 10x-70x speedup while maintaining comparable performance on both synthetic and real-world data. It is capable of running at over 20 FPS on 512x384 resolution images on a Jetson Orin Nano. The full training and evaluation code is available at https://github.com/neufieldrobotics/NeuFlow_v2.

Via

Access Paper or Ask Questions

Towards Long Term SLAM on Thermal Imagery

Mar 28, 2024

Colin Keil, Aniket Gupta, Pushyami Kaveti, Hanumant Singh

Abstract:Visual SLAM with thermal imagery, and other low contrast visually degraded environments such as underwater, or in areas dominated by snow and ice, remain a difficult problem for many state of the art (SOTA) algorithms. In addition to challenging front-end data association, thermal imagery presents an additional difficulty for long term relocalization and map reuse. The relative temperatures of objects in thermal imagery change dramatically from day to night. Feature descriptors typically used for relocalization in SLAM are unable to maintain consistency over these diurnal changes. We show that learned feature descriptors can be used within existing Bag of Word based localization schemes to dramatically improve place recognition across large temporal gaps in thermal imagery. In order to demonstrate the effectiveness of our trained vocabulary, we have developed a baseline SLAM system, integrating learned features and matching into a classical SLAM algorithm. Our system demonstrates good local tracking on challenging thermal imagery, and relocalization that overcomes dramatic day to night thermal appearance changes. Our code and datasets are available here: https://github.com/neufieldrobotics/IRSLAM_Baseline

* 8 pages, 7 figures, Submitted to IROS 2024

Via

Access Paper or Ask Questions

On Designing Consistent Covariance Recovery from a Deep Learning Visual Odometry Engine

Mar 19, 2024

Jagatpreet Singh Nir, Dennis Giaya, Hanumant Singh

Figure 1 for On Designing Consistent Covariance Recovery from a Deep Learning Visual Odometry Engine

Figure 2 for On Designing Consistent Covariance Recovery from a Deep Learning Visual Odometry Engine

Figure 3 for On Designing Consistent Covariance Recovery from a Deep Learning Visual Odometry Engine

Figure 4 for On Designing Consistent Covariance Recovery from a Deep Learning Visual Odometry Engine

Abstract:Deep learning techniques have significantly advanced in providing accurate visual odometry solutions by leveraging large datasets. However, generating uncertainty estimates for these methods remains a challenge. Traditional sensor fusion approaches in a Bayesian framework are well-established, but deep learning techniques with millions of parameters lack efficient methods for uncertainty estimation. This paper addresses the issue of uncertainty estimation for pre-trained deep-learning models in monocular visual odometry. We propose formulating a factor graph on an implicit layer of the deep learning network to recover relative covariance estimates, which allows us to determine the covariance of the Visual Odometry (VO) solution. We showcase the consistency of the deep learning engine's covariance approximation with an empirical analysis of the covariance model on the EUROC datasets to demonstrate the correctness of our formulation.

* Submitted to IROS 2024

Via

Access Paper or Ask Questions

NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices

Mar 15, 2024

Zhiyong Zhang, Huaizu Jiang, Hanumant Singh

Figure 1 for NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices

Figure 2 for NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices

Figure 3 for NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices

Figure 4 for NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices

Abstract:Real-time high-accuracy optical flow estimation is a crucial component in various applications, including localization and mapping in robotics, object tracking, and activity recognition in computer vision. While recent learning-based optical flow methods have achieved high accuracy, they often come with heavy computation costs. In this paper, we propose a highly efficient optical flow architecture, called NeuFlow, that addresses both high accuracy and computational cost concerns. The architecture follows a global-to-local scheme. Given the features of the input images extracted at different spatial resolutions, global matching is employed to estimate an initial optical flow on the 1/16 resolution, capturing large displacement, which is then refined on the 1/8 resolution with lightweight CNN layers for better accuracy. We evaluate our approach on Jetson Orin Nano and RTX 2080 to demonstrate efficiency improvements across different computing platforms. We achieve a notable 10x-80x speedup compared to several state-of-the-art methods, while maintaining comparable accuracy. Our approach achieves around 30 FPS on edge computing platforms, which represents a significant breakthrough in deploying complex computer vision tasks such as SLAM on small robots like drones. The full training and evaluation code is available at https://github.com/neufieldrobotics/NeuFlow.

Via

Access Paper or Ask Questions

OASIS: Optimal Arrangements for Sensing in SLAM

Sep 19, 2023

Pushyami Kaveti, Matthew Giamou, Hanumant Singh, David M. Rosen

Abstract:The number and arrangement of sensors on an autonomous mobile robot dramatically influence its perception capabilities. Ensuring that sensors are mounted in a manner that enables accurate detection, localization, and mapping is essential for the success of downstream control tasks. However, when designing a new robotic platform, researchers and practitioners alike usually mimic standard configurations or maximize simple heuristics like field-of-view (FOV) coverage to decide where to place exteroceptive sensors. In this work, we conduct an information-theoretic investigation of this overlooked element of mobile robotic perception in the context of simultaneous localization and mapping (SLAM). We show how to formalize the sensor arrangement problem as a form of subset selection under the E-optimality performance criterion. While this formulation is NP-hard in general, we further show that a combination of greedy sensor selection and fast convex relaxation-based post-hoc verification enables the efficient recovery of certifiably optimal sensor designs in practice. Results from synthetic experiments reveal that sensors placed with OASIS outperform benchmarks in terms of mean squared error of visual SLAM estimates.

Via

Access Paper or Ask Questions

Direct Superpoints Matching for Fast and Robust Point Cloud Registration

Jul 03, 2023

Aniket Gupta, Yiming Xie, Hanumant Singh, Huaizu Jiang

Figure 1 for Direct Superpoints Matching for Fast and Robust Point Cloud Registration

Figure 2 for Direct Superpoints Matching for Fast and Robust Point Cloud Registration

Figure 3 for Direct Superpoints Matching for Fast and Robust Point Cloud Registration

Figure 4 for Direct Superpoints Matching for Fast and Robust Point Cloud Registration

Abstract:Although deep neural networks endow the downsampled superpoints with discriminative feature representations, directly matching them is usually not used alone in state-of-the-art methods, mainly for two reasons. First, the correspondences are inevitably noisy, so RANSAC-like refinement is usually adopted. Such ad hoc postprocessing, however, is slow and not differentiable, which can not be jointly optimized with feature learning. Second, superpoints are sparse and thus more RANSAC iterations are needed. Existing approaches use the coarse-to-fine strategy to propagate the superpoints correspondences to the point level, which are not discriminative enough and further necessitates the postprocessing refinement. In this paper, we present a simple yet effective approach to extract correspondences by directly matching superpoints using a global softmax layer in an end-to-end manner, which are used to determine the rigid transformation between the source and target point cloud. Compared with methods that directly predict corresponding points, by leveraging the rich information from the superpoints matchings, we can obtain more accurate estimation of the transformation and effectively filter out outliers without any postprocessing refinement. As a result, our approach is not only fast, but also achieves state-of-the-art results on the challenging ModelNet and 3DMatch benchmarks. Our code and model weights will be publicly released.

Via

Access Paper or Ask Questions

Challenges of Indoor SLAM: A multi-modal multi-floor dataset for SLAM evaluation

Jun 14, 2023

Pushyami Kaveti, Aniket Gupta, Dennis Giaya, Madeline Karp, Colin Keil, Jagatpreet Nir, Zhiyong Zhang, Hanumant Singh

Figure 1 for Challenges of Indoor SLAM: A multi-modal multi-floor dataset for SLAM evaluation

Figure 2 for Challenges of Indoor SLAM: A multi-modal multi-floor dataset for SLAM evaluation

Figure 3 for Challenges of Indoor SLAM: A multi-modal multi-floor dataset for SLAM evaluation

Figure 4 for Challenges of Indoor SLAM: A multi-modal multi-floor dataset for SLAM evaluation

Abstract:Robustness in Simultaneous Localization and Mapping (SLAM) remains one of the key challenges for the real-world deployment of autonomous systems. SLAM research has seen significant progress in the last two and a half decades, yet many state-of-the-art (SOTA) algorithms still struggle to perform reliably in real-world environments. There is a general consensus in the research community that we need challenging real-world scenarios which bring out different failure modes in sensing modalities. In this paper, we present a novel multi-modal indoor SLAM dataset covering challenging common scenarios that a robot will encounter and should be robust to. Our data was collected with a mobile robotics platform across multiple floors at Northeastern University's ISEC building. Such a multi-floor sequence is typical of commercial office spaces characterized by symmetry across floors and, thus, is prone to perceptual aliasing due to similar floor layouts. The sensor suite comprises seven global shutter cameras, a high-grade MEMS inertial measurement unit (IMU), a ZED stereo camera, and a 128-channel high-resolution lidar. Along with the dataset, we benchmark several SLAM algorithms and highlight the problems faced during the runs, such as perceptual aliasing, visual degradation, and trajectory drift. The benchmarking results indicate that parts of the dataset work well with some algorithms, while other data sections are challenging for even the best SOTA algorithms. The dataset is available at https://github.com/neufieldrobotics/NUFR-M3F.

Via

Access Paper or Ask Questions