Object Detection


Object detection is a computer vision task whose goal is to find and localize objects of interest in an image or video. It involves determining the position and boundaries of each object and classifying it into a category. It is a core part of visual recognition, alongside image classification and retrieval.
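To make the two parts of the task concrete, here is a minimal Python sketch of what a detector's output might look like: a category label plus a bounding box for each object. The `Detection` class, the `group_by_label` helper, the `(x_min, y_min, x_max, y_max)` pixel convention, and the sample values are all assumptions made for illustration; they are not taken from any paper listed here or from a specific library.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object: a category label plus an axis-aligned box.

    The (x_min, y_min, x_max, y_max) pixel convention is an assumption
    made for this sketch; other box conventions exist.
    """
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        """Area of the box in square pixels (0 if the box is degenerate)."""
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def group_by_label(detections):
    """Group detections by category label, preserving input order."""
    groups = defaultdict(list)
    for det in detections:
        groups[det.label].append(det)
    return dict(groups)


# Hypothetical detections for a single image.
detections = [
    Detection("drone", 10, 20, 50, 60),
    Detection("bird", 100, 100, 120, 110),
    Detection("drone", 200, 40, 260, 70),
]
by_label = group_by_label(detections)
```

The box answers "where is the object?" (position and boundaries), while the label answers "what is it?" (classification). Real detectors typically attach further fields, such as a confidence score, which this sketch leaves out.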

Performance Optimization of YOLO-FEDER FusionNet for Robust Drone Detection in Visually Complex Environments

Sep 17, 2025

Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs

Oct 02, 2025

Modeling the Multivariate Relationship with Contextualized Representations for Effective Human-Object Interaction Detection

Sep 16, 2025

Data Augmentation via Latent Diffusion Models for Detecting Smell-Related Objects in Historical Artworks

Sep 18, 2025

InsFusion: Rethink Instance-level LiDAR-Camera Fusion for 3D Object Detection

Sep 10, 2025

RT-DETR++ for UAV Object Detection

Sep 11, 2025

Analytic Conditions for Differentiable Collision Detection in Trajectory Optimization

Sep 30, 2025

MuFFIN: Multifaceted Pronunciation Feedback Model with Interactive Hierarchical Neural Modeling

Oct 06, 2025

UNIV: Unified Foundation Model for Infrared and Visible Modalities

Sep 19, 2025

Explicit Multimodal Graph Modeling for Human-Object Interaction Detection

Sep 16, 2025