Xiang Bai

Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering

May 21, 2024
Hiba Maryam, Ling Fu, Jiajun Song, Tajrian ABM Shafayet, Qidi Luo, Xiang Bai, Yuliang Liu

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

May 20, 2024
Jingqun Tang, Qi Liu, Yongjie Ye, Jinghui Lu, Shu Wei, Chunhui Lin, Wanqing Li, Mohamad Fitri Faiz Bin Mahmood, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao Liu, Xiang Bai, Can Huang

The First Swahili Language Scene Text Detection and Recognition Dataset

May 19, 2024
Fadila Wendigoundi Douamba, Jianjun Song, Ling Fu, Yuliang Liu, Xiang Bai

Exploring the Capabilities of Large Multimodal Models on Dense Text

May 09, 2024
Shuo Zhang, Biao Yang, Zhang Li, Zhiyin Ma, Yuliang Liu, Xiang Bai

VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization

Apr 30, 2024
Yuliang Liu, Mingxin Huang, Hao Yan, Linger Deng, Weijia Wu, Hao Lu, Chunhua Shen, Lianwen Jin, Xiang Bai

TextSquare: Scaling up Text-Centric Visual Instruction Tuning

Apr 19, 2024
Jingqun Tang, Chunhui Lin, Zhen Zhao, Shu Wei, Binghong Wu, Qi Liu, Hao Feng, Yang Li, Siqi Wang, Lei Liao, Wei Shi, Yuliang Liu, Hao Liu, Yuan Xie, Xiang Bai, Can Huang

Bridging the Gap Between End-to-End and Two-Step Text Spotting

Apr 06, 2024
Mingxin Huang, Hongliang Li, Yuliang Liu, Xiang Bai, Lianwen Jin

SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer

Apr 04, 2024
Zijie Wu, Chaohui Yu, Yanqin Jiang, Chenjie Cao, Fan Wang, Xiang Bai

OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

Mar 28, 2024
Jianqiang Wan, Sibo Song, Wenwen Yu, Yuliang Liu, Wenqing Cheng, Fei Huang, Xiang Bai, Cong Yao, Zhibo Yang

PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model

Mar 21, 2024
Zheng Zhang, Yeyao Ma, Enming Zhang, Xiang Bai
