Yuyin Zhou

MedVLSynther: Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs

Oct 29, 2025
Via arXiv

GauSSmart: Enhanced 3D Reconstruction through 2D Foundation Models and Geometric Filtering

Oct 16, 2025
Via arXiv

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Jul 28, 2025
Via arXiv

A Survey on Latent Reasoning

Jul 08, 2025
Via arXiv

FedVLMBench: Benchmarking Federated Fine-Tuning of Vision-Language Models

Jun 11, 2025
Via arXiv

ATR-Bench: A Federated Learning Benchmark for Adaptation, Trust, and Reasoning

May 22, 2025
Via arXiv

Harnessing EHRs for Diffusion-based Anomaly Detection on Chest X-rays

May 22, 2025
Via arXiv

MedFrameQA: A Multi-Image Medical VQA Benchmark for Clinical Reasoning

May 22, 2025
Via arXiv

$\texttt{Complex-Edit}$: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark

Apr 17, 2025
Via arXiv

SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models

Apr 10, 2025
Via arXiv