
"speech": models, code, and papers

Analyzing Acoustic Word Embeddings from Pre-trained Self-supervised Speech Models

Oct 28, 2022
Ramon Sanabria, Hao Tang, Sharon Goldwater


CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition

Jan 11, 2022
Wenliang Dai, Samuel Cahyawijaya, Tiezheng Yu, Elham J. Barezi, Peng Xu, Cheuk Tung Shadow Yiu, Rita Frieske, Holy Lovenia, Genta Indra Winata, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung


Measuring Cognitive Status from Speech in a Smart Home Environment

Oct 18, 2021
Kathleen C. Fraser, Majid Komeili


Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech

Nov 07, 2021
Sung-Feng Huang, Chyi-Jiunn Lin, Hung-yi Lee


The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines

Aug 17, 2022
Gaofeng Cheng, Yifan Chen, Runyan Yang, Qingxuan Li, Zehui Yang, Lingxuan Ye, Pengyuan Zhang, Qingqing Zhang, Lei Xie, Yanmin Qian, Kong Aik Lee, Yonghong Yan


THUEE system description for NIST 2020 SRE CTS challenge

Oct 12, 2022
Yu Zheng, Jinghan Peng, Miao Zhao, Yufeng Ma, Min Liu, Xinyue Ma, Tianyu Liang, Tianlong Kong, Liang He, Minqiang Xu


Improving Character Error Rate Is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-box Acoustic Models

Oct 12, 2021
Ryosuke Sawata, Yosuke Kashiwagi, Shusuke Takahashi


Acoustical Analysis of Speech Under Physical Stress in Relation to Physical Activities and Physical Literacy

Nov 20, 2021
Si-Ioi Ng, Rui-Si Ma, Tan Lee, Raymond Kim-Wai Sum


Prosodic Clustering for Phoneme-level Prosody Control in End-to-End Speech Synthesis

Nov 19, 2021
Alexandra Vioni, Myrsini Christidou, Nikolaos Ellinas, Georgios Vamvoukakis, Panos Kakoulidis, Taehoon Kim, June Sig Sung, Hyoungmin Park, Aimilios Chalamandaris, Pirros Tsiakoulis


A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS

Sep 22, 2022
Haohan Guo, Fenglong Xie, Frank K. Soong, Xixin Wu, Helen Meng
