Chaofan Ding

R2-SVC: Towards Real-World Robust and Expressive Zero-shot Singing Voice Conversion

Oct 23, 2025
Via arXiv

Towards Video to Piano Music Generation with Chain-of-Perform Support Benchmarks

May 26, 2025
Via arXiv

MM-MovieDubber: Towards Multi-Modal Learning for Multi-Modal Movie Dubbing

May 22, 2025
Via arXiv

Towards Film-Making Production Dialogue, Narration, Monologue Adaptive Moving Dubbing Benchmarks

Apr 30, 2025
Via arXiv

DeepDubber-V1: Towards High Quality and Dialogue, Narration, Monologue Adaptive Movie Dubbing Via Multi-Modal Chain-of-Thoughts Reasoning Guidance

Mar 31, 2025
Via arXiv

DeepSound-V1: Start to Think Step-by-Step in the Audio Generation from Videos

Mar 28, 2025
Via arXiv

Enhance Generation Quality of Flow Matching V2A Model via Multi-Step CoT-Like Guidance and Combined Preference Optimization

Mar 28, 2025
Via arXiv

DeepAudio-V1: Towards Multi-Modal Multi-Stage End-to-End Video to Speech and Audio Generation

Mar 28, 2025
Via arXiv

Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search

Jan 02, 2025
Via arXiv

Multiple Consistency-guided Test-Time Adaptation for Contrastive Audio-Language Models with Unlabeled Audio

Dec 23, 2024
Via arXiv