
Kiyoharu Aizawa

A Highly Clean Recipe Dataset with Ingredient States Annotation for State Probing Task
Jul 23, 2025

PULSE: Practical Evaluation Scenarios for Large Multimodal Model Unlearning
Jul 02, 2025

MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding
May 26, 2025

Harnessing PDF Data for Improving Japanese Large Multimodal Models
Feb 20, 2025

A Benchmark and Evaluation for Real-World Out-of-Distribution Detection Using Vision-Language Models
Jan 30, 2025

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation
Oct 22, 2024

FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation
Sep 27, 2024

Training-Free Sketch-Guided Diffusion with Latent Optimization
Aug 31, 2024

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey
Jul 31, 2024

MangaUB: A Manga Understanding Benchmark for Large Multimodal Models
Jul 26, 2024