Fine-Grained Image Recognition


Geospatial-Reasoning-Driven Vocabulary-Agnostic Remote Sensing Semantic Segmentation

Feb 09, 2026
Via arXiv

Fine-Grained Cat Breed Recognition with Global Context Vision Transformer

Feb 07, 2026
Via arXiv

Fine-R1: Make Multi-modal LLMs Excel in Fine-Grained Visual Recognition by Chain-of-Thought Reasoning

Feb 07, 2026
Via arXiv

Beyond Static Cropping: Layer-Adaptive Visual Localization and Decoding Enhancement

Feb 04, 2026
Via arXiv

InspecSafe-V1: A Multimodal Benchmark for Safety Assessment in Industrial Inspection Scenarios

Jan 29, 2026
Via arXiv

DermaBench: A Clinician-Annotated Benchmark Dataset for Dermatology Visual Question Answering and Reasoning

Jan 20, 2026
Via arXiv

MuseAgent-1: Interactive Grounded Multimodal Understanding of Music Scores and Performance Audio

Jan 17, 2026
Via arXiv

Diffusion Representations for Fine-Grained Image Classification: A Marine Plankton Case Study

Jan 19, 2026
Via arXiv

Enhancing Vision Language Models with Logic Reasoning for Situational Awareness

Jan 16, 2026
Via arXiv

LP-LLM: End-to-End Real-World Degraded License Plate Text Recognition via Large Multimodal Models

Jan 14, 2026
Via arXiv