Yu Dai

From Pixels to Paths: A Multi-Agent Framework for Editable Scientific Illustration

Oct 31, 2025
Via arXiv

Dialogue as Discovery: Navigating Human Intent Through Principled Inquiry

Oct 31, 2025
Via arXiv

A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation

Jun 11, 2025
Via arXiv

ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy

Mar 09, 2025
Via arXiv

Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection

Jan 28, 2025
Via arXiv

PSFHS Challenge Report: Pubic Symphysis and Fetal Head Segmentation from Intrapartum Ultrasound Images

Sep 17, 2024
Via arXiv

VGA: Vision GUI Assistant -- Minimizing Hallucinations through Image-Centric Fine-Tuning

Jun 20, 2024
Via arXiv

Slightly Shift New Classes to Remember Old Classes for Video Class-Incremental Learning

Apr 01, 2024
Via arXiv

BAAF: A Benchmark Attention Adaptive Framework for Medical Ultrasound Image Segmentation Tasks

Oct 02, 2023
Via arXiv

Towards Continual Egocentric Activity Recognition: A Multi-modal Egocentric Activity Dataset for Continual Learning

Jan 26, 2023
Via arXiv