Yu Zeng

Corresponding author

VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning

May 28, 2025

Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation

May 05, 2025

Learning Universal Features for Generalizable Image Forgery Localization

Apr 10, 2025

VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning

Apr 10, 2025

Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control

Mar 18, 2025

Cosmos World Foundation Model Platform for Physical AI

Jan 07, 2025

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Nov 11, 2024

HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion

Oct 29, 2024

One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation

Oct 28, 2024

JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation

Jul 08, 2024