Haoyuan Li

Let's Reward Step-by-Step: Step-Aware Contrastive Alignment for Vision-Language Navigation in Continuous Environments

Mar 10, 2026

Thinking with Geometry: Active Geometry Integration for Spatial Reasoning

Feb 05, 2026

Beyond Precision: Training-Inference Mismatch is an Optimization Problem and Simple LR Scheduling Fixes It

Feb 02, 2026

MAU-GPT: Enhancing Multi-type Industrial Anomaly Understanding via Anomaly-aware and Generalist Experts Adaptation

Jan 31, 2026

Unified Personalized Understanding, Generating and Editing

Jan 11, 2026

R-Debater: Retrieval-Augmented Debate Generation through Argumentative Memory

Dec 31, 2025

Helmsman: Autonomous Synthesis of Federated Learning Systems via Multi-Agent Collaboration

Oct 16, 2025

Matrix-3D: Omnidirectional Explorable 3D World Generation

Aug 11, 2025

Heartcare Suite: Multi-dimensional Understanding of ECG with Raw Multi-lead Signal Modeling

Jun 06, 2025

Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs

Jun 06, 2025