Image Retrieval


TACOcc: Target-Adaptive Cross-Modal Fusion with Volume Rendering for 3D Semantic Occupancy

May 19, 2025

Cross-Lingual Representation Alignment Through Contrastive Image-Caption Tuning

May 19, 2025

GeoRanker: Distance-Aware Ranking for Worldwide Image Geolocalization

May 19, 2025

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

May 19, 2025

Redundancy-Aware Pretraining of Vision-Language Foundation Models in Remote Sensing

May 16, 2025

Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking

May 19, 2025

MIRACL-VISION: A Large, multilingual, visual document retrieval benchmark

May 16, 2025

GeoVLM: Improving Automated Vehicle Geolocalisation Using Vision-Language Matching

May 19, 2025

Enhancing Multi-Image Question Answering via Submodular Subset Selection

May 15, 2025

Towards Cross-modal Retrieval in Chinese Cultural Heritage Documents: Dataset and Solution

May 16, 2025