Bohan Zeng

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Dec 18, 2025

Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

Dec 14, 2025

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Dec 12, 2025

Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

Dec 11, 2025

BRACE: A Benchmark for Robust Audio Caption Quality Evaluation

Dec 11, 2025

VABench: A Comprehensive Benchmark for Audio-Video Generation

Dec 10, 2025

SciAgent: A Unified Multi-Agent System for Generalistic Scientific Reasoning

Nov 17, 2025

Rethinking Driving World Model as Synthetic Data Generator for Perception Tasks

Oct 22, 2025

MorphoBench: A Benchmark with Difficulty Adaptive to Model Reasoning

Oct 16, 2025

Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models

Jun 15, 2025