"speech": models, code, and papers

How "open" are the conversations with open-domain chatbots? A proposal for Speech Event based evaluation

Nov 24, 2022
A. Seza Doğruöz, Gabriel Skantze

Cross-Modal Mutual Learning for Cued Speech Recognition

Dec 02, 2022
Lei Liu, Li Liu

Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses

Nov 29, 2022
Yang Ai, Zhen-Hua Ling

An ASR-free Fluency Scoring Approach with Self-Supervised Learning

Mar 13, 2023
Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems

Feb 15, 2023
Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Guinan Li, Shujie Hu, Xunying Liu

An Adapter based Multi-label Pre-training for Speech Separation and Enhancement

Nov 11, 2022
Tianrui Wang, Xie Chen, Zhuo Chen, Shu Yu, Weibin Zhu

EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance

Nov 17, 2022
Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu

Code-Switching without Switching: Language Agnostic End-to-End Speech Translation

Oct 04, 2022
Christian Huber, Enes Yavuz Ugan, Alexander Waibel

Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System

May 18, 2023
Xian Shi, Haoneng Luo, Zhifu Gao, Shiliang Zhang, Zhijie Yan

Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy

Oct 20, 2022
Sarina Meyer, Pascal Tilli, Pavel Denisov, Florian Lux, Julia Koch, Ngoc Thang Vu
