Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yasamin Borhani

PoseDriver: A Unified Approach to Multi-Category Skeleton Detection for Autonomous Driving

Mar 25, 2026

Yasamin Borhani, Taylor Mordan, Yihan Wang, Reyhaneh Hosseininejad, Javad Khoramdel, Alexandre Alahi

Abstract:Object skeletons offer a concise representation of structural information, capturing essential aspects of posture and orientation that are crucial for autonomous driving applications. However, a unified architecture that simultaneously handles multiple instances and categories using only the input image remains elusive. In this paper, we introduce PoseDriver, a unified framework for bottom-up multi-category skeleton detection tailored to common objects in driving scenarios. We model each category as a distinct task to systematically address the challenges of multi-task learning. Specifically, we propose a novel approach for lane detection based on skeleton representations, achieving state-of-the-art performance on the OpenLane dataset. Moreover, we present a new dataset for bicycle skeleton detection and assess the transferability of our framework to novel categories. Experimental results validate the effectiveness of the proposed approach.

Via

Access Paper or Ask Questions

YOLO-Former: YOLO Shakes Hand With ViT

Jan 11, 2024

Javad Khoramdel, Ahmad Moori, Yasamin Borhani, Armin Ghanbarzadeh, Esmaeil Najafi

Abstract:The proposed YOLO-Former method seamlessly integrates the ideas of transformer and YOLOv4 to create a highly accurate and efficient object detection system. The method leverages the fast inference speed of YOLOv4 and incorporates the advantages of the transformer architecture through the integration of convolutional attention and transformer modules. The results demonstrate the effectiveness of the proposed approach, with a mean average precision (mAP) of 85.76\% on the Pascal VOC dataset, while maintaining high prediction speed with a frame rate of 10.85 frames per second. The contribution of this work lies in the demonstration of how the innovative combination of these two state-of-the-art techniques can lead to further improvements in the field of object detection.

Via

Access Paper or Ask Questions