Picture for Wenhao Huang

Wenhao Huang

Robust Visual Localization via Semantic-Guided Multi-Scale Transformer

Add code
Jun 10, 2025
Viaarxiv icon

ScaleLong: A Multi-Timescale Benchmark for Long Video Understanding

Add code
May 29, 2025
Viaarxiv icon

MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation

Add code
May 27, 2025
Viaarxiv icon

P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark

Add code
May 21, 2025
Viaarxiv icon

KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation

Add code
May 21, 2025
Viaarxiv icon

Seed1.5-VL Technical Report

Add code
May 11, 2025
Viaarxiv icon

FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models

Add code
May 05, 2025
Viaarxiv icon

IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs

Add code
Apr 21, 2025
Viaarxiv icon

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Add code
Apr 07, 2025
Viaarxiv icon

FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis

Add code
Mar 17, 2025
Viaarxiv icon