Scene Graph Generation


Scene graph generation is the task of producing a structured representation of a scene, typically a graph whose nodes are the objects in the scene and whose edges are the relationships between them.
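
As a rough illustration of the definition above, a scene graph can be stored as a set of object labels plus a list of (subject, predicate, object) triples. The sketch below is hypothetical: the `Relation` and `SceneGraph` names, their fields, and the example labels are assumptions made for illustration and are not taken from any of the papers listed on this page.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """One directed edge of the graph: subject --predicate--> obj (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


@dataclass
class SceneGraph:
    """Minimal scene graph: object labels as nodes, relations as labeled edges."""
    objects: set[str] = field(default_factory=set)
    relations: list[Relation] = field(default_factory=list)

    def add_relation(self, subject: str, predicate: str, obj: str) -> None:
        # Register both endpoints as nodes, then record the edge between them.
        self.objects.update((subject, obj))
        self.relations.append(Relation(subject, predicate, obj))

    def relations_of(self, name: str) -> list[Relation]:
        # All outgoing edges whose subject is the given object label.
        return [r for r in self.relations if r.subject == name]


# Example scene (made up): a person riding a horse that stands on grass.
graph = SceneGraph()
graph.add_relation("person", "riding", "horse")
graph.add_relation("horse", "standing on", "grass")
```

A real generation pipeline would fill such a structure from detector and relation-classifier outputs; here the triples are entered by hand only to show the shape of the data.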

A Neuro-Symbolic Framework for Reasoning under Perceptual Uncertainty: Bridging Continuous Perception and Discrete Symbolic Planning

Nov 18, 2025

HiGS: Hierarchical Generative Scene Framework for Multi-Step Associative Semantic Spatial Composition

Oct 31, 2025

Computer Vision based group activity detection and action spotting

Nov 17, 2025

SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards

Nov 10, 2025

Enhancing Multimodal Misinformation Detection by Replaying the Whole Story from Image Modality Perspective

Nov 09, 2025

SILVI: Simple Interface for Labeling Video Interactions

Nov 05, 2025

Lightweight Structured Multimodal Reasoning for Clinical Scene Understanding in Robotics

Sep 26, 2025

PoSh: Using Scene Graphs To Guide LLMs-as-a-Judge For Detailed Image Descriptions

Oct 21, 2025

MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning

Sep 26, 2025

Causal Reasoning Elicits Controllable 3D Scene Generation

Sep 18, 2025