Stefan Zihlmann

PicoSAM2: Low-Latency Segmentation In-Sensor for Edge Vision Applications

Jun 24, 2025

Pietro Bonazzi, Nicola Farronato, Stefan Zihlmann, Haotong Qin, Michele Magno

Abstract: Real-time, on-device segmentation is critical for latency-sensitive and privacy-aware applications like smart glasses and IoT devices. We introduce PicoSAM2, a lightweight (1.3M parameters, 336M MACs) promptable segmentation model optimized for edge and in-sensor execution, including the Sony IMX500. It builds on a depthwise separable U-Net, with knowledge distillation and fixed-point prompt encoding to learn from the Segment Anything Model 2 (SAM2). On COCO and LVIS, it achieves 51.9% and 44.9% mIoU, respectively. The quantized model (1.22MB) runs in 14.3 ms on the IMX500, achieving 86 MACs/cycle, making it the only model meeting both memory and compute constraints for in-sensor deployment. Distillation boosts LVIS performance by +3.5% mIoU and +5.1% mAP. These results demonstrate that efficient, promptable segmentation is feasible directly on-camera, enabling privacy-preserving vision without cloud or host processing.
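
The abstract attributes part of the model's small size to a depthwise separable U-Net. As a rough illustration of why that design is lightweight, the sketch below compares the weight count of a standard convolution with that of a depthwise separable one. The kernel size and channel widths are hypothetical values chosen for illustration, not taken from PicoSAM2, and biases are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, for illustration only (not from the paper).
k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
print(std, sep, round(std / sep, 2))  # 73728 8768 8.41
```

For this assumed layer shape the separable form needs roughly 8x fewer weights, which is the kind of saving that helps a model fit a tight in-sensor memory budget.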


Autonomous Navigation in Dynamic Human Environments with an Embedded 2D LiDAR-based Person Tracker

Dec 19, 2024

Davide Plozza, Steven Marty, Cyril Scherrer, Simon Schwartz, Stefan Zihlmann, Michele Magno

Abstract: In the rapidly evolving landscape of autonomous mobile robots, the emphasis on seamless human-robot interactions has shifted towards autonomous decision-making. This paper delves into the intricate challenges associated with robotic autonomy, focusing on navigation in dynamic environments shared with humans. It introduces an embedded real-time tracking pipeline, integrated into a navigation planning framework for effective person tracking and avoidance, adapting a state-of-the-art 2D LiDAR-based human detection network and an efficient multi-object tracker. By addressing the key components of detection, tracking, and planning separately, the proposed approach highlights the modularity and transferability of each component to other applications. Our tracking approach is validated on a quadruped robot equipped with a 270° 2D LiDAR against motion capture system data, with the preferred configuration achieving an average MOTA of 85.45% in three newly recorded datasets, while reliably running in real-time at 20 Hz on the NVIDIA Jetson Xavier NX embedded GPU-accelerated platform. Furthermore, the integrated tracking and avoidance system is evaluated in real-world navigation experiments, demonstrating how accurate person tracking benefits the planner in optimizing the generated trajectories, enhancing its collision avoidance capabilities. This paper contributes to safer human-robot cohabitation, blending recent advances in human detection with responsive planning to navigate shared spaces effectively and securely.

* IEEE Sensors Applications Symposium (SAS), 2024, pp. 1-6
* Accepted by SAS 2024
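
The abstract above reports tracking quality as MOTA. For readers unfamiliar with the metric, here is a minimal sketch of the commonly used CLEAR MOT definition, MOTA = 1 - (FN + FP + IDSW) / GT, assuming that is the definition the authors follow. The counts in the example are made up for illustration and are not results from the paper.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multiple Object Tracking Accuracy under the CLEAR MOT definition.

    All counts are summed over every frame of a sequence; num_gt is the
    total number of ground-truth objects across those frames.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Made-up counts, for illustration only (not from the paper).
score = mota(false_negatives=30, false_positives=20, id_switches=5, num_gt=500)
print(f"{score:.2%}")  # 89.00%
```

Because the error terms are summed before dividing, MOTA can become negative when a tracker makes more errors than there are ground-truth objects.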
