Abstract:Learning-based monocular visual odometry (VO) poses robustness, generalization, and efficiency challenges in robotics. Recent advances in visual foundation models, such as DINOv2, have improved robustness and generalization in various vision tasks, yet their integration in VO remains limited due to coarse feature granularity. In this paper, we present DINO-VO, a feature-based VO system leveraging DINOv2 visual foundation model for its sparse feature matching. To address the integration challenge, we propose a salient keypoints detector tailored to DINOv2's coarse features. Furthermore, we complement DINOv2's robust-semantic features with fine-grained geometric features, resulting in more localizable representations. Finally, a transformer-based matcher and differentiable pose estimation layer enable precise camera motion estimation by learning good matches. Against prior detector-descriptor networks like SuperPoint, DINO-VO demonstrates greater robustness in challenging environments. Furthermore, we show superior accuracy and generalization of the proposed feature descriptors against standalone DINOv2 coarse features. DINO-VO outperforms prior frame-to-frame VO methods on the TartanAir and KITTI datasets and is competitive on EuRoC dataset, while running efficiently at 72 FPS with less than 1GB of memory usage on a single GPU. Moreover, it performs competitively against Visual SLAM systems on outdoor driving scenarios, showcasing its generalization capabilities.
Abstract:We present an integrated UAV-hexapod robotic system designed for GNSS-denied maritime operations, capable of autonomous deployment and retrieval of a hexapod robot via a winch mechanism installed on a UAV. This system is intended to address the challenges of localization, control, and mobility in dynamic maritime environments. Our solution leverages sensor fusion techniques, combining optical flow, LiDAR, and depth data for precise localization. Experimental results demonstrate the effectiveness of this system in real-world scenarios, validating its performance during field tests in both controlled and operational conditions in the MBZIRC 2023 Maritime Challenge.
Abstract:In real-world field operations, aerial grasping systems face significant challenges in dynamic environments due to strong winds, shifting surfaces, and the need to handle heavy loads. Particularly when dealing with heavy objects, the powerful propellers of the drone can inadvertently blow the target object away as it approaches, making the task even more difficult. To address these challenges, we introduce SPIBOT, a novel drone-tethered mobile gripper system designed for robust and stable autonomous target retrieval. SPIBOT operates via a tether, much like a spider, allowing the drone to maintain a safe distance from the target. To ensure both stable mobility and secure grasping capabilities, SPIBOT is equipped with six legs and sensors to estimate the robot's and mission's states. It is designed with a reduced volume and weight compared to other hexapod robots, allowing it to be easily stowed under the drone and reeled in as needed. Designed for the 2024 MBZIRC Maritime Grand Challenge, SPIBOT is built to retrieve a 1kg target object in the highly dynamic conditions of the moving deck of a ship. This system integrates a real-time action selection algorithm that dynamically adjusts the robot's actions based on proximity to the mission goal and environmental conditions, enabling rapid and robust mission execution. Experimental results across various terrains, including a pontoon on a lake, a grass field, and rubber mats on coastal sand, demonstrate SPIBOT's ability to efficiently and reliably retrieve targets. SPIBOT swiftly converges on the target and completes its mission, even when dealing with irregular initial states and noisy information introduced by the drone.