Abstract: Indoor scene understanding remains a fundamental challenge in robotics, with direct implications for downstream tasks such as navigation and manipulation. Traditional approaches often rely on closed-set recognition or loop closure, limiting their adaptability in open-world environments. With the advent of visual foundation models (VFMs), open-vocabulary recognition and natural language querying have become feasible, unlocking new possibilities for 3D scene graph construction. In this paper, we propose a robust and efficient framework for instance-level 3D scene graph construction via LiDAR-camera fusion. Leveraging LiDAR's wide field of view (FOV) and long-range sensing capabilities, we rapidly acquire room-level geometric priors. Multi-level VFMs are employed to improve the accuracy and consistency of semantic extraction. During instance fusion, room-based segmentation enables parallel processing, while the integration of geometric and semantic cues significantly enhances fusion accuracy and robustness. Compared to state-of-the-art methods, our approach achieves up to an order-of-magnitude improvement in construction speed while maintaining high semantic precision. Extensive experiments in both simulated and real-world environments validate the effectiveness of our approach. We further demonstrate its practical value through a language-guided semantic navigation task, highlighting its potential for real-world robotic applications.
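The abstract above does not say how geometric and semantic cues are combined during instance fusion. As a minimal sketch of one common way to do this, and not the authors' method, the snippet below gates candidate pairs by room, then blends an axis-aligned 3D box IoU with the cosine similarity of feature vectors. The function names, dictionary keys (`room`, `box`, `feat`), the weight `w_geo`, and the threshold are all assumptions made for illustration.

```python
import math

def box_iou_3d(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax)."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:
            return 0.0  # no overlap along this axis
        inter *= hi - lo

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    return inter / (volume(a) + volume(b) - inter)

def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

def should_merge(det, inst, w_geo=0.5, threshold=0.6):
    """Decide whether a new detection belongs to an existing instance.

    Hypothetical rule: only compare within the same room, then blend
    geometric overlap and semantic similarity into one score.
    """
    if det["room"] != inst["room"]:
        return False  # room gating keeps per-room fusion independent
    geo = box_iou_3d(det["box"], inst["box"])
    sem = cosine(det["feat"], inst["feat"])
    return w_geo * geo + (1.0 - w_geo) * sem >= threshold
```

Because candidates from different rooms are never compared, each room's instances could in principle be fused by a separate worker, which is one plausible reading of the parallel processing the abstract mentions.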
Abstract: Efficient motion planning for Aerial Manipulators (AMs) is essential for tackling complex manipulation tasks, yet achieving coupled trajectory planning remains challenging. In this work, we propose, to the best of our knowledge, the first whole-body integrated motion planning framework for aerial manipulators, which is facilitated by an improved Safe Flight Corridor (SFC) generation strategy and high-dimensional collision-free trajectory planning. In particular, we formulate an optimization problem to generate feasible trajectories for both the quadrotor and manipulator while ensuring collision avoidance, dynamic feasibility, kinematic feasibility, and waypoint constraints. To achieve collision avoidance, we introduce a variable geometry approximation method, which dynamically models the changing collision volume induced by different manipulator configurations. Moreover, waypoint constraints in our framework are defined in $\mathrm{SE}(3)\times\mathbb{R}^3$, allowing the aerial manipulator to traverse specified positions while maintaining desired attitudes and end-effector states. The effectiveness of our framework is validated through comprehensive simulations and real-world experiments across various environments.
Abstract: Aerial Manipulators (AMs) provide a versatile platform for various applications, including 3D printing, architecture, and aerial grasping missions. However, their operational speed is often sacrificed to maintain precision. Existing control strategies for AMs often regard the manipulator as a disturbance and employ robust control methods to mitigate its influence. This research focuses on improving the precision of the end-effector and the agility of aerial manipulator movements. We present a composite control scheme to address these challenges. First, a Nonlinear Disturbance Observer (NDOB) is used to compensate for internal coupling effects and external disturbances. Second, manipulator dynamics are processed through a high-pass filter to facilitate agile movements. By integrating the proposed control method into a fully autonomous delta-arm-based AM system, we validate the controller's efficacy through extensive real-world experiments. The results show that the end-effector can achieve millimeter-level accuracy.
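The abstract above mentions passing manipulator dynamics through a high-pass filter but gives no filter design. As a minimal sketch of the general idea only, the snippet below implements a generic first-order discrete high-pass filter: slow or constant components of a signal are attenuated while fast changes pass through. The function name `high_pass`, the cutoff frequency, and the sample time are assumptions for illustration, and this is not the authors' controller.

```python
import math

def high_pass(samples, cutoff_hz, dt):
    """First-order discrete high-pass filter (RC form).

    y[k] = alpha * (y[k-1] + x[k] - x[k-1]),  alpha = tau / (tau + dt)
    Assumption: the filter starts at rest, so the first output is 0.
    """
    tau = 1.0 / (2.0 * math.pi * cutoff_hz)  # time constant from the cutoff
    alpha = tau / (tau + dt)
    out = []
    prev_x = None
    prev_y = 0.0
    for x in samples:
        if prev_x is None:
            y = 0.0  # no previous sample, so no change to pass through
        else:
            y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out

# A constant signal is rejected; a step produces a spike that decays.
constant = high_pass([1.0] * 5, cutoff_hz=5.0, dt=0.01)
step = high_pass([0.0, 0.0, 1.0, 1.0, 1.0], cutoff_hz=5.0, dt=0.01)
```

With this form, a steady manipulator-induced term would be filtered out over time while rapid changes reach the output immediately, which is the qualitative behavior one would expect from a high-pass stage in such a scheme.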