Zhou Zhao

SSGaussian: Semantic-Aware and Structure-Preserving 3D Style Transfer

Sep 04, 2025
Via arXiv

OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use

Aug 06, 2025
Via arXiv

EC-Diff: Fast and High-Quality Edge-Cloud Collaborative Inference for Diffusion Models

Jul 16, 2025
Via arXiv

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation

Jul 09, 2025
Via arXiv

ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing

Jun 26, 2025
Via arXiv

GenSpace: Benchmarking Spatially-Aware Image Generation

May 30, 2025
Via arXiv

IRBridge: Solving Image Restoration Bridge with Pre-trained Generative Diffusion Models

May 30, 2025
Via arXiv

TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis

May 20, 2025
Via arXiv

Depth Anything with Any Prior

May 15, 2025
Via arXiv

T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback

May 15, 2025
Via arXiv