photo


GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization

Nov 19, 2025
Via arXiv

PFAvatar: Pose-Fusion 3D Personalized Avatar Reconstruction from Real-World Outfit-of-the-Day Photos

Nov 18, 2025
Via arXiv

Talk, Snap, Complain: Validation-Aware Multimodal Expert Framework for Fine-Grained Customer Grievances

Nov 18, 2025
Via arXiv

Scalable Vision-Guided Crop Yield Estimation

Nov 17, 2025
Via arXiv

Fragile by Design: On the Limits of Adversarial Defenses in Personalized Generation

Nov 13, 2025
Via arXiv

Beyond Cosine Similarity: Magnitude-Aware CLIP for No-Reference Image Quality Assessment

Nov 13, 2025
Via arXiv

PlantTraitNet: An Uncertainty-Aware Multimodal Framework for Global-Scale Plant Trait Inference from Citizen Science Data

Nov 10, 2025
Via arXiv

Photo Dating by Facial Age Aggregation

Nov 07, 2025
Via arXiv

SARCH: Multimodal Search for Archaeological Archives

Nov 07, 2025
Via arXiv

AI based signage classification for linguistic landscape studies

Oct 27, 2025
Via arXiv