Picture for Zhengyuan Yang

Zhengyuan Yang

SceneCode: Executable World Programs for Editable Indoor Scenes with Articulated Objects

Add code
May 19, 2026
Viaarxiv icon

TextGround4M: A Prompt-Aligned Dataset for Layout-Aware Text Rendering

Add code
Apr 27, 2026
Viaarxiv icon

MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation

Add code
Apr 16, 2026
Viaarxiv icon

FlowInOne:Unifying Multimodal Generation as Image-in, Image-out Flow Matching

Add code
Apr 08, 2026
Viaarxiv icon

RAGEN-2: Reasoning Collapse in Agentic RL

Add code
Apr 07, 2026
Viaarxiv icon

BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation

Add code
Mar 26, 2026
Viaarxiv icon

RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents

Add code
Feb 02, 2026
Viaarxiv icon

ProImage-Bench: Rubric-Based Evaluation for Professional Image Generation

Add code
Dec 13, 2025
Viaarxiv icon

Computer-Use Agents as Judges for Generative User Interface

Add code
Nov 19, 2025
Viaarxiv icon

SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models

Add code
Oct 08, 2025
Figure 1 for SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models
Figure 2 for SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models
Figure 3 for SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models
Figure 4 for SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models
Viaarxiv icon