Zhou Zhao

Unleashing the Power of Natural Audio Featuring Multiple Sound Sources

Apr 24, 2025

OmniAudio: Generating Spatial Audio from 360-Degree Video

Apr 21, 2025

Continual Cross-Modal Generalization

Apr 01, 2025

Pathological Prior-Guided Multiple Instance Learning For Mitigating Catastrophic Forgetting in Breast Cancer Whole Slide Image Classification

Mar 08, 2025

Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis

Feb 26, 2025

CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale

Feb 23, 2025

EAGER-LLM: Enhancing Large Language Models as Recommenders through Exogenous Behavior-Semantic Integration

Feb 20, 2025

WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models

Feb 20, 2025

Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model

Feb 08, 2025

Low-rank Prompt Interaction for Continual Vision-Language Retrieval

Jan 24, 2025