Xie Chen

On the Effectiveness of Acoustic BPE in Decoder-Only TTS

Jul 04, 2024
Via arXiv

TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers

Jun 22, 2024
Via arXiv

GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement

Jun 17, 2024
Via arXiv

AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection

Jun 17, 2024
Via arXiv

EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark

Jun 11, 2024
Via arXiv

The Interspeech 2024 Challenge on Speech Processing Using Discrete Units

Jun 11, 2024
Via arXiv

MaLa-ASR: Multimedia-Assisted LLM-Based ASR

Jun 09, 2024
Via arXiv

LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR

Jun 07, 2024
Via arXiv

1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem

May 30, 2024
Via arXiv

AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding

May 06, 2024
Via arXiv