Haizhou Li

Multi-Scale Accent Modeling with Disentangling for Multi-Speaker Multi-Accent TTS Synthesis

Jun 16, 2024
Via arXiv

ED-sKWS: Early-Decision Spiking Neural Networks for Rapid and Energy-Efficient Keyword Spotting

Jun 14, 2024
Via arXiv

Target Speech Diarization with Multimodal Prompts

Jun 11, 2024
Via arXiv

Autoregressive Diffusion Transformer for Text-to-Speech Synthesis

Jun 08, 2024
Via arXiv

How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?

Jun 04, 2024
Via arXiv

Unsupervised Mutual Learning of Dialogue Discourse Parsing and Topic Segmentation

Jun 03, 2024
Via arXiv

TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models

May 30, 2024
Via arXiv

Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models

May 23, 2024
Via arXiv

Mamba in Speech: Towards an Alternative to Self-Attention

May 22, 2024
Via arXiv

Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis

May 15, 2024
Via arXiv