
Haoyu Cao


Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models

Feb 29, 2024
Xin Li, Yunfei Wu, Xinghua Jiang, Zhihao Guo, Mingming Gong, Haoyu Cao, Yinsong Liu, Deqiang Jiang, Xing Sun


Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration

Sep 03, 2023
Haoyu Cao, Changcun Bao, Chaohu Liu, Huang Chen, Kun Yin, Hao Liu, Yinsong Liu, Deqiang Jiang, Xing Sun


Turning a CLIP Model into a Scene Text Spotter

Aug 21, 2023
Wenwen Yu, Yuliang Liu, Xingkui Zhu, Haoyu Cao, Xing Sun, Xiang Bai


ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

Jun 05, 2023
Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, Mingyu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang, Xiang Bai


GMN: Generative Multi-modal Network for Practical Document Information Extraction

Jul 11, 2022
Haoyu Cao, Jiefeng Ma, Antai Guo, Yiqing Hu, Hao Liu, Deqiang Jiang, Yinsong Liu, Bo Ren


Relational Representation Learning in Visually-Rich Documents

May 05, 2022
Xin Li, Yan Zheng, Yiqing Hu, Haoyu Cao, Yunfei Wu, Deqiang Jiang, Yinsong Liu, Bo Ren
