Yuliang Liu

DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding

Apr 14, 2026, via arXiv

MDPBench: A Benchmark for Multilingual Document Parsing in Real-World Scenarios

Mar 30, 2026, via arXiv

Heddle: A Distributed Orchestration System for Agentic RL Rollout

Mar 30, 2026, via arXiv

Multimodal OCR: Parse Anything from Documents

Mar 13, 2026, via arXiv

Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously

Mar 12, 2026, via arXiv

TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering

Feb 26, 2026, via arXiv

ThinkOmni: Lifting Textual Reasoning to Omni-modal Scenarios via Guidance Decoding

Feb 26, 2026, via arXiv

GeoFocus: Blending Efficient Global-to-Local Perception for Multimodal Geometry Problem-Solving

Feb 09, 2026, via arXiv

Next Concept Prediction in Discrete Latent Space Leads to Stronger Language Models

Feb 09, 2026, via arXiv

Kling-Omni Technical Report

Dec 18, 2025, via arXiv