Tengchao Lv

TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering

Nov 28, 2023
Via arXiv

Kosmos-2.5: A Multimodal Literate Model

Sep 20, 2023
Via arXiv

TextDiffuser: Diffusion Models as Text Painters

May 24, 2023
Via arXiv

Language Is Not All You Need: Aligning Perception with Language Models

Mar 01, 2023
Via arXiv

XDoc: Unified Pre-training for Cross-Format Document Understanding

Oct 06, 2022
Via arXiv

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking

Apr 19, 2022
Via arXiv

DiT: Self-supervised Pre-training for Document Image Transformer

Apr 12, 2022
Via arXiv

Document AI: Benchmarks, Models and Applications

Nov 16, 2021
Via arXiv

TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models

Sep 25, 2021
Via arXiv

VT-SSum: A Benchmark Dataset for Video Transcript Segmentation and Summarization

Jun 10, 2021
Via arXiv