Kiyoharu Aizawa

Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion

Apr 24, 2024

Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models

Mar 29, 2024

Entity-NeRF: Detecting and Removing Moving Entities in Urban Scenes

Mar 24, 2024

Cross-Lingual Learning in Multilingual Scene Text Recognition

Dec 17, 2023

Semantic-Driven Initial Image Construction for Guided Image Synthesis in Diffusion Model

Dec 13, 2023

Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation

Nov 22, 2023

Can Pre-trained Networks Detect Familiar Out-of-Distribution Data?

Oct 12, 2023

Open-Set Domain Adaptation with Visual-Language Foundation Models

Jul 30, 2023

Manga109Dialog: A Large-scale Dialogue Dataset for Comics Speaker Detection

Jun 30, 2023

LoCoOp: Few-Shot Out-of-Distribution Detection via Prompt Learning

Jun 10, 2023