Xuweiyi Chen

Next-Embedding Prediction Makes Strong Vision Learners

Dec 23, 2025
Via arXiv

Empowering Dynamic Urban Navigation with Stereo and Mid-Level Vision

Dec 11, 2025
Via arXiv

4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time

Jun 23, 2025
Via arXiv

Point-MoE: Towards Cross-Domain Generalization in 3D Semantic Segmentation via Mixture-of-Experts

May 29, 2025
Via arXiv

Frame In-N-Out: Unbounded Controllable Image-to-Video Generation

May 27, 2025
Via arXiv

Probing the Mid-level Vision Capabilities of Self-Supervised Learning

Nov 25, 2024
Via arXiv

Open Vocabulary Monocular 3D Object Detection

Nov 25, 2024
Via arXiv

Learning 3D Representations from Procedural 3D Programs

Nov 25, 2024
Via arXiv

Multi-Object Hallucination in Vision-Language Models

Jul 08, 2024
Via arXiv

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Jun 12, 2024
Via arXiv