Ziyang Miao

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

Apr 06, 2026

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Sep 26, 2025

Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models

Jun 15, 2025

Lost-in-the-Middle in Long-Text Generation: Synthetic Dataset, Evaluation Framework, and Mitigation

Mar 10, 2025

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Jan 09, 2025