Yuliang Liu

Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering

May 21, 2024
Hiba Maryam, Ling Fu, Jiajun Song, Tajrian ABM Shafayet, Qidi Luo, Xiang Bai, Yuliang Liu

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

May 20, 2024
Jingqun Tang, Qi Liu, Yongjie Ye, Jinghui Lu, Shu Wei, Chunhui Lin, Wanqing Li, Mohamad Fitri Faiz Bin Mahmood, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao Liu, Xiang Bai, Can Huang

The First Swahili Language Scene Text Detection and Recognition Dataset

May 19, 2024
Fadila Wendigoundi Douamba, Jianjun Song, Ling Fu, Yuliang Liu, Xiang Bai

Exploring the Capabilities of Large Multimodal Models on Dense Text

May 09, 2024
Shuo Zhang, Biao Yang, Zhang Li, Zhiyin Ma, Yuliang Liu, Xiang Bai

VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization

Apr 30, 2024
Yuliang Liu, Mingxin Huang, Hao Yan, Linger Deng, Weijia Wu, Hao Lu, Chunhua Shen, Lianwen Jin, Xiang Bai

TextSquare: Scaling up Text-Centric Visual Instruction Tuning

Apr 19, 2024
Jingqun Tang, Chunhui Lin, Zhen Zhao, Shu Wei, Binghong Wu, Qi Liu, Hao Feng, Yang Li, Siqi Wang, Lei Liao, Wei Shi, Yuliang Liu, Hao Liu, Yuan Xie, Xiang Bai, Can Huang

Bridging the Gap Between End-to-End and Two-Step Text Spotting

Apr 06, 2024
Mingxin Huang, Hongliang Li, Yuliang Liu, Xiang Bai, Lianwen Jin

OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

Mar 28, 2024
Jianqiang Wan, Sibo Song, Wenwen Yu, Yuliang Liu, Wenqing Cheng, Fei Huang, Xiang Bai, Cong Yao, Zhibo Yang

TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document

Mar 15, 2024
Yuliang Liu, Biao Yang, Qiang Liu, Zhang Li, Zhiyin Ma, Shuo Zhang, Xiang Bai

Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization

Feb 06, 2024
Yang Jin, Zhicheng Sun, Kun Xu, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang, Yang Song, Kun Gai, Yadong Mu
