3D Visual Grounding (3DVG) aims to locate objects in 3D scenes based on text prompts, which is essential for applications such as robotics. However, existing 3DVG methods face two main challenges: first, they struggle to handle the implicit representation of spatial textures in 3D Gaussian Splatting (3DGS), making per-scene training indispensable; second, they typically require large amounts of labeled data for effective training. To this end, we propose \underline{G}rounding via \underline{V}iew \underline{R}etrieval (GVR), a novel zero-shot visual grounding framework for 3DGS that transforms 3DVG into a 2D retrieval task. GVR leverages object-level view retrieval to collect grounding clues from multiple views, which not only avoids the costly process of 3D annotation but also eliminates the need for per-scene training. Extensive experiments demonstrate that our method achieves state-of-the-art visual grounding performance while avoiding per-scene training, providing a solid foundation for zero-shot 3DVG research. Video demos can be found at https://github.com/leviome/GVR_demos.
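To make the view-retrieval idea concrete, below is a minimal sketch of ranking rendered views of a 3DGS scene against a text prompt with CLIP. This is an illustration under stated assumptions, not the paper's exact pipeline: the model choice (ViT-B/32), the \texttt{retrieve\_views} helper, and the pre-rendered view paths are all hypothetical.

\begin{verbatim}
# Minimal sketch: rank pre-rendered 3DGS views by CLIP similarity
# to a text prompt. Model choice and helper names are illustrative,
# not the paper's exact setup.
import torch
import clip  # OpenAI CLIP
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def retrieve_views(view_paths, prompt, top_k=5):
    """Return the top-k rendered views most similar to the prompt."""
    text = clip.tokenize([prompt]).to(device)
    images = torch.stack(
        [preprocess(Image.open(p)) for p in view_paths]
    ).to(device)
    with torch.no_grad():
        img_feats = model.encode_image(images)
        txt_feats = model.encode_text(text)
    # Cosine similarity between each view and the prompt.
    img_feats = img_feats / img_feats.norm(dim=-1, keepdim=True)
    txt_feats = txt_feats / txt_feats.norm(dim=-1, keepdim=True)
    scores = (img_feats @ txt_feats.T).squeeze(-1)
    best = scores.topk(min(top_k, len(view_paths))).indices.tolist()
    return [view_paths[i] for i in best]

# Usage (hypothetical file names): the retrieved views supply 2D
# grounding clues that can be lifted back into the 3DGS scene,
# e.g. via the known camera poses of each rendered view.
# views = retrieve_views(["view_000.png", "view_001.png"],
#                        "the red chair next to the window")
\end{verbatim}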