Ryo Hachiuma

VIOLA: Towards Video In-Context Learning with Minimal Annotations

Jan 22, 2026
Via arXiv

Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception

Jan 14, 2026
Via arXiv

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

Dec 23, 2025
Via arXiv

4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation

Dec 22, 2025
Via arXiv

Zoom-Zero: Reinforced Coarse-to-Fine Video Understanding via Temporal Zoom-in

Dec 16, 2025
Via arXiv

Unified Reinforcement and Imitation Learning for Vision-Language Models

Oct 22, 2025
Via arXiv

Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation

Sep 09, 2025
Via arXiv

Human Preference-Aligned Concept Customization Benchmark via Decomposed Evaluation

Sep 03, 2025
Via arXiv

Autoregressive Universal Video Segmentation Model

Aug 26, 2025
Via arXiv

GenRecal: Generation after Recalibration from Large to Small Vision-Language Models

Jun 18, 2025
Via arXiv