Picture for Yuliang Liu

Yuliang Liu

TokBench: Evaluating Your Visual Tokenizer before Visual Generation

Add code
May 26, 2025
Viaarxiv icon

WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?

Add code
May 16, 2025
Viaarxiv icon

SlimPipe: Memory-Thrifty and Efficient Pipeline Parallelism for Long-Context LLM Training

Add code
Apr 20, 2025
Viaarxiv icon

SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting

Add code
Apr 14, 2025
Viaarxiv icon

Privacy-Preserving Biometric Verification with Handwritten Random Digit String

Add code
Mar 17, 2025
Viaarxiv icon

OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models

Add code
Feb 22, 2025
Viaarxiv icon

AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence

Add code
Feb 19, 2025
Viaarxiv icon

OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning

Add code
Dec 31, 2024
Figure 1 for OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
Figure 2 for OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
Figure 3 for OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
Figure 4 for OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
Viaarxiv icon

Liquid: Language Models are Scalable Multi-modal Generators

Add code
Dec 05, 2024
Figure 1 for Liquid: Language Models are Scalable Multi-modal Generators
Figure 2 for Liquid: Language Models are Scalable Multi-modal Generators
Figure 3 for Liquid: Language Models are Scalable Multi-modal Generators
Figure 4 for Liquid: Language Models are Scalable Multi-modal Generators
Viaarxiv icon

CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy

Add code
Dec 03, 2024
Viaarxiv icon