Dimosthenis Karatzas

ComiCap: A VLMs pipeline for dense captioning of Comic Panels

Sep 24, 2024

One missing piece in Vision and Language: A Survey on Comics Understanding

Sep 14, 2024

GRIF-DM: Generation of Rich Impression Fonts using Diffusion Models

Aug 14, 2024

Image-text matching for large-scale book collections

Jul 29, 2024

CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding

Jul 04, 2024

Comics Datasets Framework: Mix of Comics datasets for detection benchmarking

Jul 03, 2024

Federated Document Visual Question Answering: A Pilot Study

May 10, 2024

Machine Unlearning for Document Classification

Apr 29, 2024

Multi-Page Document Visual Question Answering using Self-Attention Scoring Mechanism

Apr 29, 2024

Multimodal Transformer for Comics Text-Cloze

Mar 06, 2024