Yuliang Liu

BigMac: Breaking the Pareto Frontier of Compute and Memory in Multimodal LLM Training

May 25, 2026
Via arXiv

DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding

Apr 14, 2026
Via arXiv

Heddle: A Distributed Orchestration System for Agentic RL Rollout

Mar 30, 2026
Via arXiv

MDPBench: A Benchmark for Multilingual Document Parsing in Real-World Scenarios

Mar 30, 2026
Via arXiv

Multimodal OCR: Parse Anything from Documents

Mar 13, 2026
Via arXiv

Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously

Mar 12, 2026
Via arXiv

ThinkOmni: Lifting Textual Reasoning to Omni-modal Scenarios via Guidance Decoding

Feb 26, 2026
Via arXiv

TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering

Feb 26, 2026
Via arXiv

Next Concept Prediction in Discrete Latent Space Leads to Stronger Language Models

Feb 09, 2026
Via arXiv

GeoFocus: Blending Efficient Global-to-Local Perception for Multimodal Geometry Problem-Solving

Feb 09, 2026
Via arXiv