Yu Zeng

Corresponding author

VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning

May 28, 2025

Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation

May 05, 2025

VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning

Apr 10, 2025

Learning Universal Features for Generalizable Image Forgery Localization

Apr 10, 2025

Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control

Mar 18, 2025

Cosmos World Foundation Model Platform for Physical AI

Jan 07, 2025

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Nov 11, 2024

HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion

Oct 29, 2024

One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation

Oct 28, 2024

JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation

Jul 08, 2024