"speech": models, code, and papers

Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech

Nov 07, 2021
Sung-Feng Huang, Chyi-Jiunn Lin, Hung-yi Lee

FF2: A Feature Fusion Two-Stream Framework for Punctuation Restoration

Nov 09, 2022
Yangjun Wu, Kebin Fang, Yao Zhao, Hao Zhang, Lifeng Shi, Mengqi Zhang

A Diffeomorphic Flow-based Variational Framework for Multi-speaker Emotion Conversion

Nov 09, 2022
Ravi Shankar, Hsi-Wei Hsieh, Nicolas Charon, Archana Venkataraman

Acoustical Analysis of Speech Under Physical Stress in Relation to Physical Activities and Physical Literacy

Nov 20, 2021
Si-Ioi Ng, Rui-Si Ma, Tan Lee, Raymond Kim-Wai Sum

A Conformer Based Acoustic Model for Robust Automatic Speech Recognition

Mar 01, 2022
Yufeng Yang, Peidong Wang, DeLiang Wang

Prosodic Clustering for Phoneme-level Prosody Control in End-to-End Speech Synthesis

Nov 19, 2021
Alexandra Vioni, Myrsini Christidou, Nikolaos Ellinas, Georgios Vamvoukakis, Panos Kakoulidis, Taehoon Kim, June Sig Sung, Hyoungmin Park, Aimilios Chalamandaris, Pirros Tsiakoulis

Multi-accent Speech Separation with One Shot Learning

Jun 28, 2021
Kuan-Po Huang, Yuan-Kuei Wu, Hung-yi Lee

Improving Character Error Rate Is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-box Acoustic Models

Oct 12, 2021
Ryosuke Sawata, Yosuke Kashiwagi, Shusuke Takahashi

ParaTTS: Learning Linguistic and Prosodic Cross-sentence Information in Paragraph-based TTS

Sep 14, 2022
Liumeng Xue, Frank K. Soong, Shaofei Zhang, Lei Xie

Restoring degraded speech via a modified diffusion model

Apr 22, 2021
Jianwei Zhang, Suren Jayasuriya, Visar Berisha
