Object Detection


Object detection is a computer vision task whose goal is to locate objects of interest in an image or video and assign each one a category. For every object, a detector reports its position and extent, typically as a bounding box, together with a class label. Alongside image classification and image retrieval, it is a core part of visual recognition.
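To make the "position, boundaries, and category" description concrete, here is a minimal sketch of how a single detection result might be represented. The `Detection` class, its field names, the `score` field, the pixel coordinate convention, and the `group_by_label` helper are all assumptions made for illustration; they are not taken from any specific library or from the papers listed on this page.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object: a category plus an axis-aligned bounding box.

    Coordinates are pixels with the origin at the top-left corner of the
    image (an assumed convention for this sketch).
    """
    label: str
    score: float  # hypothetical confidence value in [0, 1]
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        """Box area in square pixels; zero if the box is degenerate."""
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def group_by_label(detections):
    """Collect detections per category, preserving input order."""
    grouped = defaultdict(list)
    for det in detections:
        grouped[det.label].append(det)
    return dict(grouped)


# Hand-written example values, not the output of any real detector.
detections = [
    Detection("person", 0.92, 10, 20, 110, 220),
    Detection("dog", 0.81, 150, 120, 250, 200),
    Detection("person", 0.67, 300, 40, 360, 210),
]
by_label = group_by_label(detections)
```

Real detectors differ in box encoding (corner coordinates versus center, width, and height) and in whether coordinates are normalized, so any such structure should be adapted to the model at hand.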

RecurGS: Interactive Scene Modeling via Discrete-State Recurrent Gaussian Fusion

Dec 20, 2025

Physics-Inspired Modeling and Content Adaptive Routing in an Infrared Gas Leak Detection Network

Dec 29, 2025

FocalComm: Hard Instance-Aware Multi-Agent Perception

Dec 20, 2025

ReasonCD: A Multimodal Reasoning Large Model for Implicit Change-of-Interest Semantic Mining

Dec 22, 2025

CoDi -- an exemplar-conditioned diffusion model for low-shot counting

Dec 23, 2025

LogicLens: Visual-Logical Co-Reasoning for Text-Centric Forgery Analysis

Dec 25, 2025

E-RGB-D: Real-Time Event-Based Perception with Structured Light

Dec 20, 2025

PEDESTRIAN: An Egocentric Vision Dataset for Obstacle Detection on Pavements

Dec 22, 2025

Pushing the Frontier of Audiovisual Perception with Large-Scale Multimodal Correspondence Learning

Dec 22, 2025

A Dataset and Benchmarks for Atrial Fibrillation Detection from Electrocardiograms of Intensive Care Unit Patients

Dec 19, 2025