
Kiyoharu Aizawa

JMMMU-Pro: Image-based Japanese Multi-discipline Multimodal Understanding Benchmark via Vibe Benchmark Construction

Dec 16, 2025

FoodLogAthl-218: Constructing a Real-World Food Image Dataset Using Dietary Management Applications

Dec 16, 2025

Jr. AI Scientist and Its Risk Report: Autonomous Scientific Exploration from a Baseline Paper

Nov 06, 2025

A Highly Clean Recipe Dataset with Ingredient States Annotation for State Probing Task

Jul 23, 2025

PULSE: Practical Evaluation Scenarios for Large Multimodal Model Unlearning

Jul 02, 2025

MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding

May 26, 2025

Harnessing PDF Data for Improving Japanese Large Multimodal Models

Feb 20, 2025

A Benchmark and Evaluation for Real-World Out-of-Distribution Detection Using Vision-Language Models

Jan 30, 2025

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Oct 22, 2024

FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation

Sep 27, 2024