Qingyang Hong

LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation

Jun 12, 2024
Via arXiv

MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis

Dec 28, 2023
Via arXiv

ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech

Sep 29, 2023
Via arXiv

Community Detection Graph Convolutional Network for Overlap-Aware Speaker Diarization

Jun 26, 2023
Via arXiv

Interpretable Style Transfer for Text-to-Speech with ControlVAE and Diffusion Bridge

Jun 07, 2023
Via arXiv

Towards A Unified Conformer Structure: from ASR to ASV Task

Nov 14, 2022
Via arXiv

Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting

Sep 24, 2022
Via arXiv

Deep Representation Decomposition for Rate-Invariant Speaker Verification

May 28, 2022
Via arXiv

Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data

Apr 25, 2022
Via arXiv

The xmuspeech system for multi-channel multi-party meeting transcription challenge

Feb 11, 2022
Via arXiv