Picture for Hejun Dong

Hejun Dong

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Add code
Sep 26, 2025
Viaarxiv icon

Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models

Add code
Jun 15, 2025
Figure 1 for Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models
Figure 2 for Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models
Figure 3 for Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models
Figure 4 for Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models
Viaarxiv icon

LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning

Add code
Jun 05, 2025
Viaarxiv icon

LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts

Add code
May 20, 2025
Viaarxiv icon