Abstract: Robotic manipulation in contact-rich environments remains challenging, particularly when relying on conventional tactile sensors that suffer from limited sensing range, reliability, and cost-effectiveness. In this work, we present LVTG, a low-cost visuo-tactile gripper designed for stable, robust, and efficient physical interaction. Unlike existing visuo-tactile sensors, LVTG enables more effective and stable grasping of larger and heavier everyday objects, owing to its enlarged tactile sensing area and wider opening angle. Its surface skin is made of a highly wear-resistant material, significantly improving durability and extending operational lifespan. The integration of vision and tactile feedback allows LVTG to provide rich, high-fidelity sensory data, facilitating reliable perception during complex manipulation tasks. Furthermore, LVTG features a modular design that supports rapid maintenance and replacement. To effectively fuse vision and touch, we adopt a CLIP-inspired contrastive learning objective that aligns tactile embeddings with their corresponding visual observations, yielding a shared cross-modal representation space for visuo-tactile perception. This alignment improves the performance of an Action Chunking Transformer (ACT) policy in contact-rich manipulation, leading to more efficient data collection and more effective policy learning. Compared with the original ACT method, the proposed LVTG with pretraining achieves significantly higher success rates in manipulation tasks.
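As a concrete illustration of the CLIP-inspired objective mentioned above, the sketch below shows a symmetric contrastive loss between paired tactile and visual embeddings. This is a minimal PyTorch sketch assuming batch-aligned pairs and a fixed temperature; the function and variable names are illustrative stand-ins, not the authors' implementation.

```python
# Minimal sketch of a CLIP-style symmetric contrastive loss aligning tactile
# and visual embeddings. Names and hyperparameters here are assumptions for
# illustration, not the LVTG authors' code.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(tactile_emb, visual_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired tactile/visual embeddings.

    tactile_emb, visual_emb: (B, D) tensors from the two modality encoders.
    Matching pairs share the same batch index; all other pairs are negatives.
    """
    # L2-normalize so the dot product becomes cosine similarity.
    t = F.normalize(tactile_emb, dim=-1)
    v = F.normalize(visual_emb, dim=-1)

    # (B, B) similarity matrix, scaled by temperature.
    logits = t @ v.t() / temperature

    # The i-th tactile embedding should match the i-th visual embedding.
    targets = torch.arange(logits.size(0), device=logits.device)

    # Average the tactile->vision and vision->tactile cross-entropy terms.
    loss_t2v = F.cross_entropy(logits, targets)
    loss_v2t = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_t2v + loss_v2t)

# Usage with random stand-in embeddings:
if __name__ == "__main__":
    tactile = torch.randn(32, 256)
    vision = torch.randn(32, 256)
    print(clip_contrastive_loss(tactile, vision).item())
```

Minimizing this loss pulls each tactile embedding toward its paired visual observation and pushes it away from the other samples in the batch, which is what produces the shared cross-modal representation space described in the abstract.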
Abstract: Low-cost inertial navigation systems (INS) are prone to sensor biases and measurement noise, which lead to rapid degradation of navigation accuracy during global positioning system (GPS) outages. To address this challenge and improve positioning continuity in GPS-denied environments, this paper proposes a brain-inspired GPS/INS fusion network (BGFN) based on spiking neural networks (SNNs). The BGFN architecture integrates a spiking Transformer with a spiking encoder to extract spatial features from inertial measurement unit (IMU) signals while capturing their temporal dynamics. By modeling the relationship between vehicle attitude, specific force, angular rate, and GPS-derived position increments, the network leverages both current and historical IMU data to estimate vehicle motion. The effectiveness of the proposed method is evaluated through real-world field tests and experiments on public datasets. The results demonstrate that, compared with conventional deep learning approaches, BGFN achieves higher accuracy and enhanced reliability in navigation performance, particularly under prolonged GPS outages.
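To make the spiking formulation concrete, the sketch below shows a minimal leaky integrate-and-fire (LIF) network that maps a window of IMU measurements to a position increment, with a surrogate gradient so it can be trained by backpropagation. The layer sizes, time constants, and surrogate function are assumptions for illustration; the paper's spiking Transformer and spiking encoder are not reproduced here.

```python
# Minimal sketch of a spiking (LIF) regressor from IMU windows to position
# increments, in the spirit of BGFN. All dimensions and constants below are
# illustrative assumptions, not the paper's architecture.
import torch
import torch.nn as nn

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, fast-sigmoid surrogate gradient
    in the backward pass, so the network remains trainable end to end."""
    @staticmethod
    def forward(ctx, mem):
        ctx.save_for_backward(mem)
        return (mem > 0).float()

    @staticmethod
    def backward(ctx, grad_out):
        (mem,) = ctx.saved_tensors
        # Fast-sigmoid surrogate: d(spike)/d(mem) ~ 1 / (1 + |mem|)^2
        return grad_out / (1.0 + mem.abs()) ** 2

class LIFLayer(nn.Module):
    """Leaky integrate-and-fire layer driven by a linear projection."""
    def __init__(self, in_dim, out_dim, beta=0.9, threshold=1.0):
        super().__init__()
        self.fc = nn.Linear(in_dim, out_dim)
        self.beta = beta            # membrane decay per time step (assumed)
        self.threshold = threshold  # firing threshold (assumed)

    def forward(self, x_seq):
        # x_seq: (T, B, in_dim) window of IMU samples treated as time steps.
        T, B, _ = x_seq.shape
        mem = torch.zeros(B, self.fc.out_features, device=x_seq.device)
        spikes = []
        for t in range(T):
            mem = self.beta * mem + self.fc(x_seq[t])      # leaky integration
            spk = SurrogateSpike.apply(mem - self.threshold)
            mem = mem - spk * self.threshold               # soft reset on spike
            spikes.append(spk)
        return torch.stack(spikes)  # (T, B, out_dim)

class SpikingIMURegressor(nn.Module):
    """Rate-coded readout: mean firing rates -> 3-D position increment."""
    def __init__(self, imu_dim=6, hidden=128):
        super().__init__()
        self.lif = LIFLayer(imu_dim, hidden)
        self.readout = nn.Linear(hidden, 3)

    def forward(self, imu_window):
        rates = self.lif(imu_window).mean(dim=0)  # (B, hidden) firing rates
        return self.readout(rates)                # (B, 3) position increment

# Usage: a 100-step window of 6-axis IMU data (3 accel + 3 gyro), batch of 8.
if __name__ == "__main__":
    model = SpikingIMURegressor()
    imu = torch.randn(100, 8, 6)
    print(model(imu).shape)  # torch.Size([8, 3])
```

The key design point this sketch captures is that the membrane state carries information across time steps, so the network naturally exploits both current and historical IMU samples when regressing the GPS-derived position increment.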