Yuliang Liu

DocThinker: Explainable Multimodal Large Language Models with Rule-based Reinforcement Learning for Document Understanding

Aug 12, 2025
Via arXiv

Shuffle-R1: Efficient RL framework for Multimodal Large Language Models via Data-centric Dynamic Shuffle

Aug 07, 2025
Via arXiv

MSTAR: Box-free Multi-query Scene Text Retrieval with Attention Recycling

Jun 12, 2025
Via arXiv

MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm

Jun 05, 2025
Via arXiv

TokBench: Evaluating Your Visual Tokenizer before Visual Generation

May 26, 2025
Via arXiv

WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?

May 16, 2025
Via arXiv

SlimPipe: Memory-Thrifty and Efficient Pipeline Parallelism for Long-Context LLM Training

Apr 20, 2025
Via arXiv

SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting

Apr 14, 2025
Via arXiv

Privacy-Preserving Biometric Verification with Handwritten Random Digit String

Mar 17, 2025
Via arXiv

OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models

Feb 22, 2025
Via arXiv