Zehan Wang

GenSpace: Benchmarking Spatially-Aware Image Generation

May 30, 2025

T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback

May 15, 2025

Depth Anything with Any Prior

May 15, 2025

Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision

Apr 30, 2025

RoboGround: Robotic Manipulation with Grounded Vision-Language Priors

Apr 30, 2025

Unleashing the Power of Natural Audio Featuring Multiple Sound Sources

Apr 24, 2025

EAGER-LLM: Enhancing Large Language Models as Recommenders through Exogenous Behavior-Semantic Integration

Feb 20, 2025

OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios

Jan 02, 2025

Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models

Dec 24, 2024

Rate-Distortion Optimized Skip Coding of Region Adaptive Hierarchical Transform Coefficients for MPEG G-PCC

Dec 07, 2024