Kuniaki Saito

BioVITA: Biological Dataset, Model, and Benchmark for Visual-Textual-Acoustic Alignment

Mar 25, 2026

HalDec-Bench: Benchmarking Hallucination Detector in Image Captioning

Mar 16, 2026

Where-to-Unmask: Ground-Truth-Guided Unmasking Order Learning for Masked Diffusion Language Models

Feb 10, 2026

Towards Safer Mobile Agents: Scalable Generation and Evaluation of Diverse Scenarios for VLMs

Jan 13, 2026

CaptionSmiths: Flexibly Controlling Language Pattern in Image Captioning

Jul 02, 2025

SBS Figures: Pre-training Figure QA from Stage-by-Stage Synthesized Images

Dec 23, 2024

Is Large-Scale Pretraining the Secret to Good Domain Generalization?

Dec 03, 2024

Weak-to-Strong Compositional Learning from Generative Models for Language-based Object Detection

Jul 21, 2024

Unsupervised LLM Adaptation for Question Answering

Feb 16, 2024

ERM++: An Improved Baseline for Domain Generalization

Apr 04, 2023