
Mengshi Qi

SGFormer++: Semantic Graph Transformer for Incremental 3D Scene Graph Generation

Jun 13, 2026
Via arXiv

Leveraging Metric Depth for Relative Depth Prediction

Jun 09, 2026
Via arXiv

A VideoMAE-v2 Approach to Zero-Shot Traffic Accident Anticipation

Jun 08, 2026
Via arXiv

Claude Code-Driving Scenario Mining for the Argoverse 2 Challenge

Jun 08, 2026
Via arXiv

Global-Local Monte Carlo Tree Search in Vision-Language Models for Text-to-3D Indoor Scene Generation

Jun 04, 2026
Via arXiv

Question-Aware Evidence Ledgers for Video Relational Reasoning

Jun 01, 2026
Via arXiv

Active Exploring like a Pigeon: Reinforcing Spatial Reasoning via Agentic Vision-Language Models

Jun 01, 2026
Via arXiv

Explainable Action Form Assessment by Exploiting Multimodal Chain-of-Thoughts Reasoning

Dec 17, 2025
Via arXiv

SoccerNet 2025 Challenges Results

Aug 26, 2025
Via arXiv

Chain-of-Thought Textual Reasoning for Few-shot Temporal Action Localization

Apr 18, 2025
Via arXiv