Indoor Scene Understanding


Indoor scene understanding is the process of analyzing and interpreting indoor environments from images or videos.

JRDB-Pose3D: A Multi-person 3D Human Pose and Shape Estimation Dataset for Robotics

Feb 03, 2026
Via arXiv

Enhancing Indoor Occupancy Prediction via Sparse Query-Based Multi-Level Consistent Knowledge Distillation

Feb 02, 2026
Via arXiv

DSCD-Nav: Dual-Stance Cooperative Debate for Object Navigation

Jan 29, 2026
Via arXiv

ReScene4D: Temporally Consistent Semantic Instance Segmentation of Evolving Indoor 3D Scenes

Jan 16, 2026
Via arXiv

3D CoCa v2: Contrastive Learners with Test-Time Search for Generalizable Spatial Intelligence

Jan 10, 2026
Via arXiv

LabelAny3D: Label Any Object 3D in the Wild

Jan 04, 2026
Via arXiv

Bayesian Monocular Depth Refinement via Neural Radiance Fields

Jan 07, 2026
Via arXiv

MoniRefer: A Real-world Large-scale Multi-modal Dataset based on Roadside Infrastructure for 3D Visual Grounding

Dec 31, 2025
Via arXiv

From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs

Dec 22, 2025
Via arXiv

$M^3-Verse$: A "Spot the Difference" Challenge for Large Multimodal Models

Dec 21, 2025
Via arXiv