Haoyu Cao

Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction

Jun 18, 2024

HRVDA: High-Resolution Visual Document Assistant

Apr 10, 2024

Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models

Feb 29, 2024

Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration

Sep 03, 2023

Turning a CLIP Model into a Scene Text Spotter

Aug 21, 2023

ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

Jun 05, 2023

GMN: Generative Multi-modal Network for Practical Document Information Extraction

Jul 11, 2022

Relational Representation Learning in Visually-Rich Documents

May 05, 2022