Scene Classification


Unified Representation Space for 3D Visual Grounding

Jun 17, 2025
Via arXiv

Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models

Jun 17, 2025
Via arXiv

Description and Discussion on DCASE 2025 Challenge Task 4: Spatial Semantic Segmentation of Sound Scenes

Jun 12, 2025
Via arXiv

MSSDF: Modality-Shared Self-supervised Distillation for High-Resolution Multi-modal Remote Sensing Image Learning

Jun 11, 2025
Via arXiv

Evidential Deep Learning with Spectral-Spatial Uncertainty Disentanglement for Open-Set Hyperspectral Domain Generalization

Jun 11, 2025
Via arXiv

SAMSelect: A Spectral Index Search for Marine Debris Visualization using Segment Anything

Jun 10, 2025
Via arXiv

Hallucinate, Ground, Repeat: A Framework for Generalized Visual Relationship Detection

Jun 06, 2025
Via arXiv

HyperPointFormer: Multimodal Fusion in 3D Space with Dual-Branch Cross-Attention Transformers

May 29, 2025
Via arXiv

Co-AttenDWG: Co-Attentive Dimension-Wise Gating and Expert Fusion for Multi-Modal Offensive Content Detection

May 25, 2025
Via arXiv

Light as Deception: GPT-driven Natural Relighting Against Vision-Language Pre-training Models

May 30, 2025
Via arXiv