Title: CLIPVehicle: A Unified Framework for Vision-based Vehicle Search

Aug 06, 2025

Likai Wang, Ruize Han, Xiangqun Zhang, Wei Feng

[Figures 1-4 for CLIPVehicle: A Unified Framework for Vision-based Vehicle Search]

Abstract: Vehicles are among the most common and significant objects in the real world, and computer vision research on them, such as vehicle detection and vehicle re-identification, has made remarkable progress. To search for a vehicle of interest in surveillance videos, existing methods first detect and store all vehicle patches and then apply vehicle re-identification models, which is resource-intensive and not very practical. In this work, we aim to achieve joint detection and re-identification for vehicle search. However, the two objectives conflict: detection focuses on the commonness shared by all vehicles, whereas re-identification focuses on the uniqueness of individual vehicles, which makes it challenging for a model to learn both in an end-to-end system. To address this problem, we propose a new unified framework, CLIPVehicle, which contains a dual-granularity semantic-region alignment module that leverages Vision-Language Models (VLMs) for vehicle discrimination modeling, and a multi-level vehicle identification learning strategy that learns identity representations at the global, instance, and feature levels. We also construct a new benchmark for vehicle search, comprising a real-world dataset, CityFlowVS, and two synthetic datasets, SynVS-Day and SynVS-All. Extensive experimental results demonstrate that our method outperforms state-of-the-art methods for both vehicle Re-ID and person search.
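To make the task setting concrete, the retrieval step of vehicle search can be sketched as ranking every detected vehicle in the gallery frames by the similarity of its embedding to the query embedding. The sketch below is illustrative only and is not the CLIPVehicle implementation: the function names, the `(frame_id, box, embedding)` tuple layout, the use of cosine similarity, and the toy 3-dimensional embeddings are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_gallery(query_embedding, detections):
    """Rank detections by similarity to the query, most similar first.

    `detections` is a hypothetical list of (frame_id, box, embedding) tuples,
    one per vehicle detected in the gallery frames.
    Returns a list of (score, frame_id, box).
    """
    scored = [
        (cosine_similarity(query_embedding, embedding), frame_id, box)
        for frame_id, box, embedding in detections
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


# Toy data: made-up embeddings, not outputs of any real model.
query = [1.0, 0.0, 0.0]
detections = [
    ("frame_001", (10, 20, 110, 90), [0.0, 1.0, 0.0]),
    ("frame_002", (40, 35, 150, 120), [0.9, 0.1, 0.0]),
    ("frame_003", (5, 5, 60, 50), [0.5, 0.5, 0.0]),
]

ranking = rank_gallery(query, detections)
for score, frame_id, box in ranking:
    print(f"{frame_id} {box} score={score:.3f}")
```

In a two-stage pipeline, the boxes and embeddings would come from separate detection and re-identification models; a joint model of the kind the abstract describes would instead produce both from the full frame in a single pass.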
