Picture for Haoyuan Li

Haoyuan Li

Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs

Add code
Jun 06, 2025
Viaarxiv icon

Heartcare Suite: Multi-dimensional Understanding of ECG with Raw Multi-lead Signal Modeling

Add code
Jun 06, 2025
Viaarxiv icon

Fast-Slow Thinking for Large Vision-Language Model Reasoning

Add code
Apr 25, 2025
Viaarxiv icon

CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation

Add code
Mar 07, 2025
Figure 1 for CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation
Figure 2 for CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation
Figure 3 for CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation
Figure 4 for CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation
Viaarxiv icon

DQO-MAP: Dual Quadrics Multi-Object mapping with Gaussian Splatting

Add code
Mar 04, 2025
Viaarxiv icon

MINT: Multi-modal Chain of Thought in Unified Generative Models for Enhanced Image Generation

Add code
Mar 03, 2025
Viaarxiv icon

MLINE-VINS: Robust Monocular Visual-Inertial SLAM With Flow Manhattan and Line Features

Add code
Mar 03, 2025
Viaarxiv icon

UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting

Add code
Feb 25, 2025
Viaarxiv icon

Boosting Private Domain Understanding of Efficient MLLMs: A Tuning-free, Adaptive, Universal Prompt Optimization Framework

Add code
Dec 27, 2024
Viaarxiv icon

Coverage-based Fairness in Multi-document Summarization

Add code
Dec 11, 2024
Viaarxiv icon