Haochen Wang

Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology

Jul 10, 2025

Holistic Tokenizer for Autoregressive Image Generation

Jul 03, 2025

VGR: Visual Grounded Reasoning

Jun 16, 2025

FastMap: Revisiting Dense and Scalable Structure from Motion

May 07, 2025

Integrating Learning-Based Manipulation and Physics-Based Locomotion for Whole-Body Badminton Robot Control

Apr 24, 2025

The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer

Apr 14, 2025

Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness

Apr 02, 2025

DocVideoQA: Towards Comprehensive Understanding of Document-Centric Videos through Question Answering

Mar 20, 2025

OpenSatMap: A Fine-grained High-resolution Satellite Dataset for Large-scale Map Construction

Oct 30, 2024

Reconstructive Visual Instruction Tuning

Oct 12, 2024