Xiang Bai

Looking and Listening: Audio Guided Text Recognition

Jun 06, 2023
Wenwen Yu, Mingyu Liu, Biao Yang, Enming Zhang, Deqiang Jiang, Xing Sun, Yuliang Liu, Xiang Bai

ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

Jun 05, 2023
Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, Mingyu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang, Xiang Bai

SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model

Jun 04, 2023
Dingyuan Zhang, Dingkang Liang, Hongcheng Yang, Zhikang Zou, Xiaoqing Ye, Zhe Liu, Xiang Bai

On the Hidden Mystery of OCR in Large Multimodal Models

May 13, 2023
Yuliang Liu, Zhang Li, Hongliang Li, Wenwen Yu, Mingxin Huang, Dezhi Peng, Mingyu Liu, Mingrui Chen, Chunyuan Li, Lianwen Jin, Xiang Bai

Multi-Modal 3D Object Detection by Box Matching

May 12, 2023
Zhe Liu, Xiaoqing Ye, Zhikang Zou, Xinwei He, Xiao Tan, Errui Ding, Jingdong Wang, Xiang Bai

Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution

May 12, 2023
Jianfeng Kuang, Wei Hua, Dingkang Liang, Mingkun Yang, Deqiang Jiang, Bo Ren, Yu Zhou, Xiang Bai

A Large Cross-Modal Video Retrieval Dataset with Reading Comprehension

May 05, 2023
Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Hong Zhou, Mike Zheng Shou, Xiang Bai

ICDAR 2023 Competition on Reading the Seal Title

Apr 24, 2023
Wenwen Yu, Mingyu Liu, Mingrui Chen, Ning Lu, Yinlong Wen, Yuliang Liu, Dimosthenis Karatzas, Xiang Bai

SOOD: Towards Semi-Supervised Oriented Object Detection

Apr 10, 2023
Wei Hua, Dingkang Liang, Jingyu Li, Xiaolong Liu, Zhikang Zou, Xiaoqing Ye, Xiang Bai

ICDAR 2023 Video Text Reading Competition for Dense and Small Text

Apr 10, 2023
Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Mike Zheng Shou, Umapada Pal, Dimosthenis Karatzas, Xiang Bai
