"speech": models, code, and papers

Guided Variational Autoencoder for Speech Enhancement With a Supervised Classifier

Feb 12, 2021
Guillaume Carbajal, Julius Richter, Timo Gerkmann

Via arXiv

Building an ASR Error Robust Spoken Virtual Patient System in a Highly Class-Imbalanced Scenario Without Speech Data

Apr 11, 2022
Vishal Sunder, Prashant Serai, Eric Fosler-Lussier

Via arXiv

Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation

Oct 31, 2022
Liyong Guo, Xiaoyu Yang, Quandong Wang, Yuxiang Kong, Zengwei Yao, Fan Cui, Fangjun Kuang, Wei Kang, Long Lin, Mingshuang Luo, Piotr Zelasko, Daniel Povey

Via arXiv

Three-class Overlapped Speech Detection using a Convolutional Recurrent Neural Network

Apr 07, 2021
Jee-weon Jung, Hee-Soo Heo, Youngki Kwon, Joon Son Chung, Bong-Jin Lee

Via arXiv

Improving Speech Recognition Accuracy of Local POI Using Geographical Models

Jul 07, 2021
Songjun Cao, Yike Zhang, Xiaobing Feng, Long Ma

Via arXiv

Continuous Speech Separation with Ad Hoc Microphone Arrays

Mar 03, 2021
Dongmei Wang, Takuya Yoshioka, Zhuo Chen, Xiaofei Wang, Tianyan Zhou, Zhong Meng

Via arXiv

HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation

Oct 26, 2022
Chunhui Wang, Chang Zeng, Xing He

Via arXiv

Time-domain Speech Enhancement with Generative Adversarial Learning

Mar 30, 2021
Feiyang Xiao, Jian Guan, Qiuqiang Kong, Wenwu Wang

Via arXiv

Towards Transferable Speech Emotion Representation: On loss functions for cross-lingual latent representations

Mar 28, 2022
Sneha Das, Nicole Nadine Lønfeldt, Anne Katrine Pagsberg, Line H. Clemmensen

Via arXiv

Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody

Jun 29, 2022
Peter Makarov, Ammar Abbas, Mateusz Łajszczak, Arnaud Joly, Sri Karlapati, Alexis Moinet, Thomas Drugman, Penny Karanasou

Via arXiv