Dimosthenis Karatzas

Understanding Video Scenes through Text: Insights from Text-based Video Question Answering

Sep 11, 2023

STEP -- Towards Structured Scene-Text Spotting

Sep 05, 2023

Reading Between the Lanes: Text VideoQA on the Road

Jul 08, 2023

ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

Jun 05, 2023

ICDAR 2023 Competition on Reading the Seal Title

Apr 24, 2023

ICDAR 2023 Video Text Reading Competition for Dense and Small Text

Apr 10, 2023

DocILE Benchmark for Document Information Localization and Extraction

Feb 11, 2023

Hierarchical multimodal transformers for Multi-Page DocVQA

Dec 07, 2022

Watching the News: Towards VideoQA Models that can Read

Nov 10, 2022

Show, Interpret and Tell: Entity-aware Contextualised Image Captioning in Wikipedia

Sep 21, 2022