Cha Zhang

Kosmos-2.5: A Multimodal Literate Model

Sep 20, 2023
Tengchao Lv, Yupan Huang, Jingye Chen, Lei Cui, Shuming Ma, Yaoyao Chang, Shaohan Huang, Wenhui Wang, Li Dong, Weiyao Luo, Shaoxiang Wu, Guoxin Wang, Cha Zhang, Furu Wei

From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding

May 23, 2023
Li Sun, Florian Luisier, Kayhan Batmanghelich, Dinei Florencio, Cha Zhang

Diffusion-based Document Layout Generation

Mar 19, 2023
Liu He, Yijuan Lu, John Corring, Dinei Florencio, Cha Zhang

Unifying Vision, Text, and Layout for Universal Document Processing

Dec 20, 2022
Zineng Tang, Ziyi Yang, Guoxin Wang, Yuwei Fang, Yang Liu, Chenguang Zhu, Michael Zeng, Cha Zhang, Mohit Bansal

XDoc: Unified Pre-training for Cross-Format Document Understanding

Oct 06, 2022
Jingye Chen, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei

Understanding Long Documents with Different Position-Aware Attentions

Aug 17, 2022
Hai Pham, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang

DiT: Self-supervised Pre-training for Document Image Transformer

Apr 12, 2022
Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei

Improving Structured Text Recognition with Regular Expression Biasing

Nov 10, 2021
Baoguang Shi, Wenfeng Cheng, Yijuan Lu, Cha Zhang, Dinei Florencio
